HIGH THROUGHPUT METHODS FOR HAPLOTYPING

FIELD OF THE INVENTION 
The invention relates to high throughput methods for single nucleotide 
polymorphism (SNP) haplotyping. In particular, the methods involve analysis of
polymorphic loci of a nucleic acid using techniques involving hybridization, primer 
extension, MALDI TOF, HPLC, and/or fluorescence detection. 

BACKGROUND OF THE INVENTION
In recent years, genetic alterations which cause or contribute to many different
diseases have been identified. A few of the diseases associated with genetic alterations
are genetically simple and are associated with a single genetic alteration. Once the
genetic alteration associated with a genetically simple disease is identified,
characterization and diagnosis of the disease is relatively simple. Most phenotypic traits
and diseases, however, are genetically complex. The genetic complexity can arise as a
result of the interaction or disruption of multiple genes, incomplete penetrance, genetic
heterogeneity, and/or environmental/random causes (phenocopy). (Lander, E.S. and
Schork, N.J., Science, 265:2037-2048 (1994)). Mapping of complex traits or diseases
requires that the entire genome be scanned in order to identify all genomic regions that
potentially contribute to the development of that trait or disease. In general, genome
wide scans are performed using polymorphic DNA markers to determine which markers
segregate with a complex trait of interest. The loci which are identified as contributing
to a disease can then be mapped to specific genomic regions based on the known
chromosomal locations of the markers segregating with or "linked" to that trait.

Several types of DNA polymorphisms or markers occur in the human genome
and can be used in genome wide scans. These include restriction fragment length
polymorphisms (RFLPs), microsatellites or simple sequence length polymorphisms
(SSLPs), and single nucleotide polymorphisms (SNPs).

RFLPs are single nucleotide changes (point changes or insertion/deletion
changes) which alter a restriction site and thus the digestion pattern of a given segment
of DNA. RFLPs were the first type of polymorphism identified and were used as a tool
to construct early genetic linkage maps in humans. RFLPs are unsuitable for a large
scale analysis of populations, however, because they are unreliable and not amenable to
automation. RFLPs are unreliable when used to analyze genetically-related individuals,
because RFLPs have only two alleles, one with the restriction site and one without, and
related individuals generally have the same allele on both chromosomes. Additionally,
RFLPs are not amenable to automation because RFLP detection requires the use of
Southern Blot techniques which are not easily automated.

Microsatellite markers or SSLPs are sequences that are repeated in tandem, with
the number of repeats resulting in multiple alleles of different lengths. Microsatellite
markers are useful for identifying genes involved in traits which follow simple
Mendelian, monogenic patterns of inheritance. Microsatellites, however, have proven to
be unsuitable for studies involving traits which follow non-Mendelian complex patterns
of inheritance because microsatellites are not optimally abundant, occurring only once
every few kilobases. Microsatellites also have a high mutation and recombination rate
which makes them genetically unstable. Microsatellite markers are not amenable to high
throughput analysis because they can only be analyzed using PCR and gel-based assays,
which require a substantial investment in labor and time as well as cost.

SNPs are single base pair positions in the genome at which different sequence
alternatives (alleles) exist in the population at frequencies of greater than 1%. SNPs are
extremely stable and dense within the genome, but are not optimally informative because
each identifies only a single locus, and thus has low statistical power.

SUMMARY OF THE INVENTION

The invention relates to a high throughput method for SNP-based haplotyping,
which is capable of assessing multiple alleles in large numbers of genomic samples.
SNP haplotype analysis is much more informative than single SNP locus analysis because
it enables the analysis of complex traits. Each haplotype segregates as a contiguous set
of alleles within families and consideration of multiple closely-linked marker loci can
provide a larger number of alleles, each of low frequency. If a chromosomal region has
multiple polymorphic loci, none of which are individually very informative, then
haplotypes of these loci can be used to define a new locus with a heterozygosity and
informativeness significantly beyond that of any single marker contained therein. The
high throughput method of SNP-haplotyping described and claimed herein provides
improved methods for SNP-haplotyping that can dramatically increase the rate of
haplotype analysis and enable large scale haplotyping studies.
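
By way of illustration only, the gain in informativeness obtained by treating a
haplotype as a single multi-allelic locus can be sketched numerically. The sketch
below assumes Hardy-Weinberg proportions, so that expected heterozygosity is one
minus the sum of squared allele frequencies; the haplotype frequencies and the
function names are hypothetical and are not taken from any data described herein.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2), assuming Hardy-Weinberg proportions."""
    freqs = list(freqs)
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)


def marginal_allele_freqs(haplotype_freqs, snp_index):
    """Collapse haplotype frequencies to the allele frequencies of one SNP."""
    out = {}
    for haplotype, freq in haplotype_freqs.items():
        allele = haplotype[snp_index]
        out[allele] = out.get(allele, 0.0) + freq
    return out


# Hypothetical frequencies of the four haplotypes formed by two biallelic SNPs.
haplotype_freqs = {
    ("A", "G"): 0.40,
    ("A", "T"): 0.10,
    ("C", "G"): 0.15,
    ("C", "T"): 0.35,
}

h_snp1 = expected_heterozygosity(marginal_allele_freqs(haplotype_freqs, 0).values())
h_snp2 = expected_heterozygosity(marginal_allele_freqs(haplotype_freqs, 1).values())
h_hap = expected_heterozygosity(haplotype_freqs.values())

print(f"SNP1 alone: {h_snp1:.3f}  SNP2 alone: {h_snp2:.3f}  haplotype locus: {h_hap:.3f}")
```

Under these assumed frequencies the two-SNP haplotype locus has a higher expected
heterozygosity than either SNP considered alone, which is the sense in which a
haplotype defines a more informative locus.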



WO 01/75163                                                    PCT/US01/10173

In one aspect the invention is a method for haplotyping. The method involves
analyzing a first polymorphic locus of a nucleic acid within a sample by specifically
capturing the nucleic acid on a surface wherein the step of capturing the nucleic acid on
the surface identifies a first allele of a first SNP of the polymorphic locus, repeating the
analysis of the first polymorphic locus of the nucleic acid to identify a second allele of
the first SNP of the polymorphic locus, separately analyzing a second SNP of a
polymorphic locus of the nucleic acid sample to identify both alleles of the second SNP,
and determining the haplotype based on the identification of each allele of each SNP.
The term "separately" refers to analysis in discrete physical locations. Although the first
and second SNPs are analyzed separately, they may be analyzed simultaneously. The
different SNP alleles may also be analyzed on the same surface (e.g., the surface of a
slide) as long as they are analyzed on different spots or discrete locations of the slide
from one another.

In some embodiments the second SNP is analyzed using a method selected from
the group consisting of hybridization, primer extension, MALDI-TOF, and HPLC. In one
preferred method the second SNP is analyzed by hybridization of the nucleic acid sample
with an ASO complementary to a first allele of the second SNP and an ASO
complementary to a second allele of the second SNP.

In other embodiments the nucleic acid is captured by hybridization with an ASO,
wherein the ASO is fixed to a surface. Preferably a first ASO complementary to a
first allele of the first SNP and a second ASO complementary to a second allele of the
first SNP are hybridized to the surface and are used to capture the nucleic acid.

In some embodiments each ASO corresponding to an allele of the first SNP
further includes a spacer sequence. Preferably the spacer sequence is selected from the
group consisting of a poly-T, poly-A, poly-C, and poly-G.

In this embodiment each of the ASOs corresponding to an allele of the second
SNP may be hybridized independently to the nucleic acid sample. Alternatively the
alleles of the second SNP are analyzed simultaneously with one another.

In preferred embodiments at least one of the ASOs complementary to an allele of
the first SNP and at least one of the ASOs complementary to an allele of the second SNP
contains a fluorescent label or quencher, the fluorescent label or quencher of the two
ASOs being distinct from one another.

The surface may be any type of solid support, such as, for instance, a multiwell
dish, a chip, a slide or a bead.

The nucleic acid sample may be prepared by any method known in the art. For
instance the nucleic acid may be prepared by PCR amplification of a polymorphic locus
from a genomic DNA sample. Alternatively, the nucleic acid sample may be a reduced
complexity genome. In some embodiments the nucleic acid sample is labeled with a
first label.

According to other embodiments the presence of one set of alleles at the
polymorphic locus is associated with a disease and the haplotyping method is performed
to identify predisposition to the disease.

The methods for haplotyping may also involve analysis of more than two SNPs.
In one embodiment a third SNP of a polymorphic locus of the nucleic acid sample is
analyzed to identify both alleles of the third SNP, and the haplotype is determined based
on the identification of each allele of each SNP. In another embodiment a fourth SNP of
a polymorphic locus of the nucleic acid sample is analyzed to identify both alleles of the
fourth SNP, and the haplotype is determined based on the identification of each allele of
each SNP. Many genes are known to have multiple SNPs; for example, the APOE gene
has a reported 23 variable sites. Therefore the haplotyping technology of the invention
involves the analysis of haplotypes containing multiple (i.e., >2) SNPs, e.g., using a
microarray-based haplotyping method that can determine the haplotype for any number
of SNPs with just two hybridizations.

The invention in other aspects relates to a method for haplotyping by analyzing a
genotype of a first SNP of a polymorphic locus of a nucleic acid within a sample in
solution by detecting the presence or absence of a first labeled probe which specifically
identifies a first putative allele of the SNP and detecting the presence or absence of a
second labeled probe which specifically identifies a second putative allele of the SNP,
separating the nucleic acid sample based on the genotype of the first SNP, and analyzing
a second SNP of the polymorphic locus of the separated nucleic acid samples to identify
the haplotype of the nucleic acid.

In preferred embodiments the analysis of the first SNP is performed using
fluorescence detection and the nucleic acid sample is separated using flow cytometry.

In other embodiments the second SNP is analyzed using a method selected from
the group consisting of hybridization, primer extension, MALDI-TOF, and HPLC.

According to another aspect of the invention a method for haplotyping is
provided. The method involves labeling first and second SNPs of a polymorphic locus
of a nucleic acid within a sample in solution with first, second, third, and fourth labeled
probes which specifically identify a first and second putative allele of the first SNP and
a first and second putative allele of the second SNP, respectively, separating the labeled
nucleic acid sample into single nucleic acid molecules, and detecting the presence or
absence of the first, second, third, and fourth labeled probes on the single nucleic acid
molecules to identify the haplotype of the nucleic acid.
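
By way of illustration only, the interpretation of single-molecule detection data
can be sketched as follows. The sketch assumes that each detected molecule is
reported as the set of probe labels observed on it; the label names, the allele
assignments and the function names are hypothetical placeholders rather than
features of any particular instrument or probe set.

```python
from collections import Counter

# Hypothetical assignment of four spectrally distinct labels to SNP alleles.
PROBES = {
    "label_1": ("SNP1", "G"),
    "label_2": ("SNP1", "T"),
    "label_3": ("SNP2", "G"),
    "label_4": ("SNP2", "T"),
}


def call_molecule(detected_labels):
    """Return (SNP1 allele, SNP2 allele) for one molecule, or None if not callable."""
    alleles = {"SNP1": set(), "SNP2": set()}
    for label in detected_labels:
        snp, allele = PROBES[label]
        alleles[snp].add(allele)
    # A single molecule carries exactly one allele at each SNP; anything else
    # (a missing probe, or two probes for the same SNP) is treated as uncallable.
    if len(alleles["SNP1"]) != 1 or len(alleles["SNP2"]) != 1:
        return None
    return (next(iter(alleles["SNP1"])), next(iter(alleles["SNP2"])))


def call_sample(molecules):
    """Count the haplotypes observed across all callable molecules of a sample."""
    counts = Counter()
    for labels in molecules:
        haplotype = call_molecule(labels)
        if haplotype is not None:
            counts[haplotype] += 1
    return counts


# Hypothetical detection events for one individual.
molecules = [
    {"label_1", "label_3"},
    {"label_2", "label_4"},
    {"label_1", "label_3"},
    {"label_1"},              # incomplete labeling, ignored
    {"label_2", "label_4"},
]
counts = call_sample(molecules)
print(counts)
```

In this hypothetical sample two haplotypes, G-G and T-T, are each observed on two
molecules, consistent with a heterozygous individual carrying one copy of each.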

In one embodiment the probes are labeled with fluorescent molecules and
optionally each of the fluorescent molecules of the labeled probes is spectrally distinct.

The invention in another aspect is a method for haplotyping by performing four
hybridization reactions on a nucleic acid sample, each of the four hybridization reactions
involving one labeled probe specific for one allele of one of two SNPs, each of the
labeled probes labeled with a spectrally distinct label and wherein each label on the
probe specific for a first of the two SNPs is a spectral pair with the label on each probe
specific for the second of the two SNPs, bringing each of the labeled probes in each
hybridization reaction within energy transfer distance from one another, exciting one of
the labeled probes in each hybridization reaction, and detecting electromagnetic radiation
released from the other labeled probe as a signal, wherein the presence or absence of a
signal for each hybridization reaction is an indicator of the haplotype of the nucleic acid
sample. The method can be performed in solution or on a surface.

In some embodiments each hybridization reaction is performed in a separate
vessel. In other embodiments the labeled probes are brought within energy transfer
proximity of one another using binding partners, such as avidin and biotin. In yet other
embodiments the labeled probes are labeled ASOs.
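
By way of illustration only, the reading of the four energy transfer reactions can
be sketched as follows. The sketch assumes one reading of the method in which each
reaction pairs a probe for one allele of the first SNP with a probe for one allele
of the second SNP, and in which a signal above a threshold indicates that both
alleles lie on the same molecule. The allele names, the signal values and the
threshold are hypothetical.

```python
from itertools import product

SNP1_ALLELES = ("G", "T")  # hypothetical alleles of the first SNP
SNP2_ALLELES = ("G", "T")  # hypothetical alleles of the second SNP


def haplotypes_from_signals(signals, threshold=0.5):
    """Call haplotypes from four energy-transfer readings.

    `signals` maps (SNP1 allele, SNP2 allele) to the measured emission of the
    second (acceptor) label, in arbitrary units. The threshold is an assumed,
    assay-specific cut-off.
    """
    called = [
        pair
        for pair in product(SNP1_ALLELES, SNP2_ALLELES)
        if signals.get(pair, 0.0) >= threshold
    ]
    # A diploid sample carries one haplotype (homozygote) or two (heterozygote).
    if not 1 <= len(called) <= 2:
        raise ValueError(f"uninterpretable signal pattern: {called}")
    return called


# Hypothetical readings from the four reactions for one individual.
readings = {
    ("G", "G"): 0.90,
    ("G", "T"): 0.05,
    ("T", "G"): 0.10,
    ("T", "T"): 0.80,
}
print(haplotypes_from_signals(readings))
```

A pattern with three or four positive reactions is rejected rather than guessed at,
since it cannot arise from a single diploid sample under the stated assumptions.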

A kit is provided according to other aspects of the invention. The kit includes
one or more containers housing: a first set of ASOs, wherein the first set of ASOs
represents two ASOs, each containing one of the two alleles of a first SNP in a
polymorphic locus, a second set of ASOs, wherein the second set of ASOs represents
two ASOs, each containing one of the two alleles of a second SNP in the polymorphic
locus, and instructions for performing a hybridization reaction to determine a haplotype
from a genomic DNA sample using the first and second sets of ASOs.

Optionally the kit may include a set of PCR primers for amplifying the
polymorphic locus of the genomic DNA sample.

In some embodiments the first set of ASOs are fixed to a surface and the second
set of ASOs are labeled.

In other embodiments the ASOs include a spacer and the spacer sequence is
selected from the group consisting of a poly-T, poly-A, poly-C, and poly-G.

Each of the limitations of the invention can encompass various embodiments of
the invention. It is, therefore, anticipated that each of the limitations of the invention
involving any one element or combination of elements, can be included in each aspect of
the invention.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is a flow chart and diagram depicting an exemplary method for
performing high throughput SNP haplotyping analysis.

Figure 2 is a diagram depicting an exemplary arrangement for performing
hybridization reactions to determine haplotype.

Figure 3 is a diagram depicting each potential result of a double
hybridization method for a single chromosome at a polymorphic locus. There are four
possible haplotypes, each individually depicted in one of the four rows. Columns A-D
refer to the hybridization surfaces schematically pictured in Figure 2.

Figure 4 is a diagram depicting examples of nucleic acid samples labeled with 
different fluorescent labels, to exemplify the single molecule detection methods. 

Figure 5 is a graph depicting data generated from the haplotyping of 4 individuals
(column sets 1-4). Haplotypes for each individual are as follows: #1-homozygote A-G,
#2-homozygote A-G, #3-heterozygote G-G, A-G, #4-homozygote A-A.

Figure 6 is a diagram depicting four graphs generated from the haplotyping
of 4 individuals (graphs 1-4). Haplotypes for each individual are as follows:
#1-homozygote G-C, #2-heterozygote G-T, G-C, #3-heterozygote G-C, C-C,
#4-heterozygote G-T, C-C.



BRIEF DESCRIPTION OF THE SEQUENCES 
SEQ ID NO:1 is a PCR primer for M13: CCTCAGTGACATCCTTGCCT.
SEQ ID NO:2 is a PCR primer for M13: GATGGCCATTGTTGTGTGGT.
SEQ ID NO:3 is a SNP1-(G detecting) oligo: NH2-(T)15AGTGTCGC(C)TTTCGCT.
SEQ ID NO:4 is a SNP1-(T detecting) oligo: NH2-(T)15AGTCTCGC(A)CTTTGGGT.
SEQ ID NO:5 is a SNP2-(G detecting) oligo: AGGGTGGT(G)GGAGAGGT.
SEQ ID NO:6 is a SNP2-(T detecting) oligo: AGGGTGGT(A)CGAGAGGT.
SEQ ID NO:7 is a PCR forward primer: [PO4]-ACTTGAGAGCGAGTGTGCTG.
SEQ ID NO:8 is a PCR reverse primer: GTGGGTTTGGTGGGTGAG.
SEQ ID NO:9 is a BAR-G oligo: NH2-(T)23GAGGGAATGGAAGGCAT.
SEQ ID NO:10 is a BAR-A oligo: NH2-(T)23CAGCCAATAGAAGCCAT.
SEQ ID NO:11 is a BARPHGcc oligo: AGGAAATCGGCAGCTGT.
SEQ ID NO:12 is a BARPHAcc oligo: AGGAAATCAGCAGCTGT.
SEQ ID NO:13 is a biotinylated BAR-PDG oligo: [Bio]-AGGAAATCGGCAGCTGT.
SEQ ID NO:14 is a biotinylated BAR-PDA oligo: [Bio]-AGGAAATGAGGAGCTGT.
SEQ ID NO:15 is a PCR forward primer: GAAGAGGAATGCAGATTACGATGG.
SEQ ID NO:16 is a PCR reverse primer: GTGTGAAGTATTTCTGGGCAGGATA.
SEQ ID NO:17 is an amine-labeled 4035AmG oligo: NH2(T)23GCCACAATGAATGACAT.
SEQ ID NO:18 is an amine-labeled 4035AmC oligo: NH2(T)23GCGACAATGAATGACAT.
SEQ ID NO:19 is a 4035ColdCompG oligo: ATGTGATTGATTGTGGG.
SEQ ID NO:20 is a 4035ColdCompG oligo: ATGTCATTCATTGTGGC.
SEQ ID NO:21 is a biotinylated 4035-CB oligo: Biotin-TGTATAATCAGAATTAT.
SEQ ID NO:22 is a biotinylated 4035-TB oligo: Biotin-TGTATAATTAGAATTAT.
SEQ ID NO:23 is a cold competitive 4035-G oligo: TGTATAATGAGAATTAT.
SEQ ID NO:24 is a cold competitive 4035-T oligo: TGTATAATTAGAATTAT.

DETAILED DESCRIPTION
In recent years, certain diseases have been identified where the occurrence of
certain polymorphic haplotypes is associated with either an increase in susceptibility for
developing disease or with differences in the onset, progression and/or severity of the
disease. These diseases include, for example, multiple sclerosis (Kalman, B. and Lublin,
F.D., Biomed and Pharmacother., 53:358-370 (1999)), insulin-dependent diabetes
mellitus (Deschamps, I. and Khalil, I., Diabetes Metabol Rev., 9:71-92 (1993)), and
narcolepsy (Billiard, M. and Seignalet, J., Lancet, 1:226-227 (1985)). The haplotypes
associated with these disease susceptibility loci have been identified using standard
technology, such as RFLP or microsatellite analysis. Alzheimer's Disease is one of the
few diseases where the predisposing haplotype is composed of commonly inherited
SNPs (Corder, E.H., et al., Science, 261:921-923 (1993)). These SNPs occur in the
epsilon allele of the apolipoprotein E locus (APOE). In Alzheimer's Disease,
approximately half of late-onset familial and sporadic Alzheimer's Disease (i.e.,
development of disease by the age of 70 years) is associated with inheriting the e4/e4
SNP haplotype. Individuals inheriting any other combination of the e2, e3, or e4 alleles
have a significantly reduced risk of developing late-onset Alzheimer's Disease, with the
lowest risk for Alzheimer's Disease being associated with the e2/e3 haplotype.
Additionally, the precise haplotype an individual inherits can also predict the age of
onset of disease, from younger than 70 years for the e4/e4 haplotype to greater than 90
years for the e2/e3 haplotype, a more than 20 year shift in susceptibility.

The invention involves the identification of high throughput methods for
screening DNA to identify polymorphic haplotypes and to enable identification of
haplotypes associated with predisposition to these and other diseases as well as other
genetically associated traits. The high throughput method is based on the analysis of
SNPs. In one aspect the invention involves the use of a capture step to analyze the SNPs.

Two types of SNP haplotyping methods have been described in the prior art, the
3' mismatch PCR-SSP and SMD methods. 3' mismatch PCR-sequence-specific primers
(PCR-SSP) or allele-specific amplification (ASA) is a PCR-based method which utilizes
a primer pair such that the 3' base on each primer represents one of the SNPs within a
SNP1/SNP2 haplotype. The primers are mixed with the nucleic acid sample and allowed
to anneal each to its respective SNP within the nucleic acid sample and PCR is
performed. If the nucleic acid sample contains both SNP1 and SNP2, a PCR product
will be produced and detected in a gel or hybridization method. If the nucleic acid
sample does not contain either or both SNP1 or SNP2, then no PCR product will be
produced. By mixing different sets of 3' mismatch primers, one can determine the
haplotype of the SNPs in the targeted genomic region by determining which primer set
results in a PCR product. One of the disadvantages of this method is that it requires
extensive methodology including optimization of PCR conditions for every primer pair
utilized and electrophoretic analysis. These methods are not conducive to high
throughput analysis of haplotypes.

Single molecule dilution (SMD) is a method which involves serial dilution of
genomic DNA until an average of one molecule or haploid equivalent of DNA per 5-10
aliquots is reached. After dilution, a multi-step PCR reaction known as a booster PCR is
performed with each of the 5-10 aliquots. The PCR reactions are analyzed by gel
electrophoresis and then with dot blot hybridization or direct sequencing to determine
haplotypes. There are many disadvantages associated with this technique, including the
many laborious steps which prevent high throughput screening, the increased likelihood
of shearing due to the dilution steps, and sensitivity of the reaction to any DNA
contamination.
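
By way of illustration only, the labor involved in SMD can be appreciated from a
simple dilution model. The sketch below assumes that the number of template
molecules falling into each aliquot follows a Poisson distribution, which is a
modelling assumption and not a statement from the SMD literature; a mean of one
molecule per 5-10 aliquots then corresponds to 0.1 to 0.2 molecules per aliquot.
The function names are hypothetical.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k molecules in an aliquot with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def aliquot_summary(mean_molecules_per_aliquot):
    """Fractions of empty, single-molecule and multi-molecule aliquots."""
    p_empty = poisson_pmf(0, mean_molecules_per_aliquot)
    p_single = poisson_pmf(1, mean_molecules_per_aliquot)
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": 1.0 - p_empty - p_single,
        # Among aliquots that contain any template, the share holding exactly one.
        "single_given_nonempty": p_single / (1.0 - p_empty),
    }


for aliquots_per_molecule in (5, 10):
    summary = aliquot_summary(1.0 / aliquots_per_molecule)
    print(aliquots_per_molecule, {k: round(v, 4) for k, v in summary.items()})
```

Under this assumed model most aliquots contain no template at all, so many
reactions must be run for each informative single-molecule amplification, while a
small fraction of the non-empty aliquots still contains more than one molecule.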

In order for a marker to be effective in genetically dissecting complex traits in
genome wide scans, the marker should be abundant, stable, informative, amenable to
high throughput analysis, have high scoring power, and be useful in linkage
disequilibrium analysis. The ability for a marker to be amenable to high throughput
analysis is very important. Due to the genetic complexity with which most phenotypic
traits and diseases arise, genome wide scan analysis requires the genotyping of thousands
of individuals in order to achieve adequate statistical power. The polymorphic markers
used, thus, must be amenable to a high throughput and cost efficient method of analysis
in order to analyze the extremely large numbers of samples required. The SNP-based
methods of the invention are high throughput, whereas the 3' mismatch PCR-SSP and
SMD methods are not.

Scoring power is also important. Scoring power refers to the degree of ease,
accuracy and reliability with which a marker's presence or absence can be determined in
the genome that is being analyzed. In order to genotype thousands of test genomes in a
time and cost efficient manner, the polymorphic marker must be easily scored as either
present or absent and this scoring must be accurate and reliable without the need for
secondary rounds of testing. Neither RFLP marker analysis nor microsatellite marker
analysis for complex traits offers high throughput analysis or high scoring power.
The scoring power of microsatellite markers is low, since the determination of whether a
marker is present or absent requires highly skilled labor for reading gels because of the
difficulty associated with distinguishing alleles at each locus on gels. The SNP
haplotyping methods of the invention have high scoring power.

Linkage disequilibrium is the preferential association within populations
of one allele of one locus with another allele of another locus, at a frequency greater than
that expected by chance. (Brookes, A.J., Gene, 234:177-186 (1999)). If a new
polymorphic allele develops within a grouping of other polymorphic alleles at other
contiguous loci and is so closely linked with these other alleles that recombination within
that region is very unlikely, then, as the disease allele becomes replicated over time, all
the alleles would be replicated together. Thus, linkage disequilibrium would have been
established between the disease allele and the alleles within that grouping. This is
important because the existence of linkage disequilibrium between polymorphic alleles
would enable an allele of one polymorphic marker to be used as a beacon to locate the
specific allele of another polymorphism. Linkage disequilibrium is a powerful tool that
is useful in locating genes involved in the development of a complex trait. This tool can
only be used, however, if the polymorphic markers are extremely stable and not prone to
recombination events which would disrupt the DNA sequence of the polymorphic
marker, and very dense such that a sufficient number of markers will be found in a
non-recombinatorial distance to the complex trait alleles, thereby assuring their
association. SNPs are extremely stable and dense within the genome and thus have high
statistical power as polymorphic markers for linkage disequilibrium studies.

The high throughput SNP haplotyping method of the invention overcomes many
of the problems with the prior art methods of haplotyping. The methods of the invention
which involve either capture of specific SNPs and/or solution phase detection are
amenable to high throughput and allow the simultaneous discrimination and haplotyping
of multiple SNP loci for both chromosomes of an individual. Methods can be performed
on many nucleic acid samples at a time, thus providing massive quantities of haplotype
information, which is useful in characterizing complex traits and diseases. Additionally,
the methods provide fewer false readings than some prior art methods.

In one aspect, the invention is a method for haplotyping, which involves the
specific capture of a nucleic acid on the surface. The method involves analyzing a first
polymorphic locus of a nucleic acid within a sample by specifically capturing the nucleic
acid on a surface wherein the step of capturing the nucleic acid on the surface identifies a
first allele of a first SNP of the polymorphic locus, repeating the analysis of the first
polymorphic locus of the nucleic acid to identify a second allele of the first SNP of the
polymorphic locus, analyzing a second SNP of a polymorphic locus of the nucleic acid
sample to identify both alleles of the second SNP, and determining the haplotype based
on the identification of each allele of each SNP.

Haplotyping is a process of genetic analysis which involves identifying genetic
markers within a linked genetic region. The term haplotype is derived from the phrase
"haploid genotype" and refers to the allelic constitution of a single chromosome or
chromosomal region at two or more loci. The term has developed two variant uses in the
field of human genetics. The first use of haplotype refers to the arrangement of alleles
along a given section of a chromosome and is frequently used in association with disease
mappings and studies to identify which closely linked polymorphic markers in a number
of affected individuals are held in common by descent from a common ancestor who
possessed the founder chromosome. The second use of the term haplotype refers to a
small genetic region within which recombination is very rare, such that specific allelic
combinations of polymorphic markers are seldom, if ever, disrupted by meiotic
recombination. As a result, linkage disequilibrium exists and certain allelic
combinations will occur in the population much more frequently than would be
expected by chance while other combinations will occur much less frequently. The
haplotype analysis described herein is consistent with the second use of the term
haplotype.

Thus the term "haplotype" as used herein, refers to an ordered combination of
alleles in a defined genetic region that co-segregate. Such alleles are said to be "linked."
The alleles of the haplotype may be within a gene, between genes, or in adjacent genes
or chromosomal regions that co-segregate with high fidelity.

The term "linkage" refers to the degree to which regions of a nucleic acid are
inherited together. DNA regions on different chromosomes are inherited together 50% of
the time and do not exhibit linkage.

The term "linkage disequilibrium" refers to the co-segregation of two alleles at
linked loci such that the frequency of the co-segregation of the alleles is greater than
would be expected from separate frequencies of occurrence of each allele.
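
By way of illustration only, this definition can be expressed with the standard
population-genetics coefficient D, the difference between the observed frequency
of a two-allele combination and the product of the separate allele frequencies.
The coefficient is a conventional measure introduced here for illustration, and
the haplotype frequencies and function name below are hypothetical.

```python
def ld_coefficient(haplotype_freqs, allele1, allele2):
    """D = p(allele1 and allele2 together) - p(allele1) * p(allele2)."""
    p_joint = haplotype_freqs.get((allele1, allele2), 0.0)
    p1 = sum(f for (a, _), f in haplotype_freqs.items() if a == allele1)
    p2 = sum(f for (_, b), f in haplotype_freqs.items() if b == allele2)
    return p_joint - p1 * p2


# Hypothetical population frequencies of the four two-SNP haplotypes.
haplotype_freqs = {
    ("A", "G"): 0.40,
    ("A", "T"): 0.10,
    ("C", "G"): 0.15,
    ("C", "T"): 0.35,
}

d_ag = ld_coefficient(haplotype_freqs, "A", "G")
d_at = ld_coefficient(haplotype_freqs, "A", "T")
print(f"D(A,G) = {d_ag:+.3f}, D(A,T) = {d_at:+.3f}")
```

A positive value, as for the A-G combination in this hypothetical population,
means the two alleles occur together more often than their separate frequencies
would predict; a value of zero would indicate no linkage disequilibrium.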

In one method for SNP haplotyping, at least one of the two polymorphic loci is
analyzed using a capture step. A nucleic acid within a sample is specifically captured on
a surface in order to identify the first allele of a first SNP of the polymorphic locus. The
nucleic acid can be captured by any method known in the art for sequence-specific
nucleic acid capture. For instance, an allele-specific oligonucleotide (ASO) which is
complementary to a sequence spanning the first SNP of the polymorphic locus of the
nucleic acid may be attached to the surface then caused to interact with the nucleic acid
by a hybridization reaction. Alternatively, any binding molecule, which is specific for
the first SNP of the polymorphic locus of the nucleic acid, may be used to bind and
interact with the nucleic acid to capture it on the surface. Additionally, a binding
molecule, such as an ASO, which is linked to a first binding partner, such as streptavidin,
may be allowed to hybridize or interact with the first SNP region of the nucleic acid
within the sample to form a complex. This complex may then be interacted with a
surface containing a second binding partner, such as biotin, attached thereto. Other
methods for capturing a nucleic acid in a sequence-specific manner will be apparent to
those of ordinary skill in the art. For instance primer extension, oligonucleotide ligation
assay (OLA) or a combination of binding partner-ASO hybridization can be used.

Binding partner-ASO hybridization is a method which involves a tag attached to
an ASO which can specifically hybridize to a nucleic acid. The tag is a binding partner
which can specifically bind to another molecule and thus capture the ASO or
ASO/nucleic acid complex. Binding partners include for instance biotin, avidin,
fluorescein, anti-fluorescein antibodies, other antigens and antibodies, haptens, chemical
groups which are capable of specifically interacting with specific compounds, and nucleic
acids that can specifically hybridize with nucleic acids attached to a surface.

The capture step is carried out for each of the two alleles of the first SNP of the
polymorphic locus of the nucleic acid in the sample. The capture steps performed on the
first and second allele may be the same (e.g., both may involve allele-specific
hybridization of the nucleic acid sample to an ASO attached to a surface) or different
(e.g., analysis of the first allele may involve allele-specific hybridization and capture of
the second allele may involve use of binding partners). It is important that both the
first and second alleles of the first SNP be identified using a capture step, so that the
identity of both alleles of the first SNP within the nucleic acid sample is determined.

Once the first two alleles of the first SNP are identified, both alleles of the second
SNP are identified to determine the haplotype. The alleles of the second SNP may be
determined using any methods known in the art for identifying SNPs. These methods
include, but are not limited to, hybridization, primer extension, MALDI-TOF, HPLC,
solution phase detection, and fluorescence detection.

Methods for identifying alleles of a SNP using hybridization include the methods
described above. For instance, an ASO/nucleic acid sample complex, which is
hybridized to a surface as described above, may be subjected to a second hybridization
reaction to detect the identity of the second SNP in the nucleic acid sample. In this
method, probes such as ASOs, which are complementary to both potential alleles of the
second SNP, can be separately hybridized to the ASO/nucleic acid sample complex
attached to the surface to identify the presence of the second SNP. If the probe or ASOs
are labeled, the presence of the bound label can be detected to determine the presence or
absence of the hybridization reaction.

Primer extension can also be used to identify the alleles of the second SNP.
Primer extension is performed by hybridizing primers which flank but do not span the
second SNP and performing a primer extension reaction to produce a PCR product. The
primers may hybridize directly to the nucleic acid adjacent to the polymorphic site or
they may hybridize to a site which is some distance away. It is possible to determine
which allele is present in the nucleic acid sample in one of several ways. For instance, if
one possible allele is a G at the polymorphic site then a labeled G can be added to the
primer extension mixture instead of an unlabeled G. In some cases the labeled
nucleotide is a dideoxynucleotide which will stop the production of the strand being
created. The label may be any type of detectable label, e.g., a fluorescent label or a
binding partner, e.g., biotin.

MALDI-TOF (matrix-assisted laser desorption ionization time of flight) mass
spectrometry provides for the spectrometric determination of the mass of poorly ionizing
or easily-fragmented analytes of low volatility by embedding them in a matrix of light-absorbing
material and measuring the weight of the molecule as it is ionized and caused
to fly by volatilization. Combinations of electric and magnetic fields are applied on the
sample to cause the ionized material to move depending on the individual mass and
charge of the molecule. U.S. Patent No. 6,043,031, issued to Koster et al., describes an
exemplary method for identifying single-base mutations within DNA using MALDI-TOF
and other methods of mass spectrometry. Other methods are described in U.S.
Patent Nos. 6,002,127; 5,965,363; 5,905,259; 5,885,775; and 5,288,644, each of which is
incorporated by reference.

HPLC (high performance liquid chromatography) is used for the analytical
separation of bio-polymers, based on properties of the bio-polymers. HPLC can be used
to separate nucleic acid sequences based on size and/or charge. A nucleic acid sequence
having one base pair difference from another nucleic acid can be separated using HPLC.
Thus, nucleic acid samples which are identical except for a single allele may be
differentially separated using HPLC to identify the presence or absence of a particular
allele. Preferably the HPLC is dHPLC (denaturing HPLC). dHPLC involves the
denaturation of the nucleic acid sample, followed by a reannealing step where the nucleic
acid can assume a secondary structure, which will differ somewhat in nucleic acid
samples having different alleles.

In some embodiments, the ASO or other probes or binding molecules are fixed to a
surface. A surface, as used herein, refers to any type of solid support material to which a
molecular component such as an ASO is capable of being fixed. Surfaces include, for
instance, single or multi-well dishes, chips, slides, membranes, beads, agarose or other
types of solid support mediums.

The nucleic acid sample being analyzed is any type of nucleic acid in which
potential SNP-haplotypes exist. For instance, the nucleic acid sample may be an isolated
genome or a portion of an isolated genome. An isolated genome consists of all of the
DNA material from a particular organism, i.e., the entire genome. A portion of an
isolated genome, which is referred to herein as a reduced complexity genome (RCG), is a
plurality of DNA fragments within an isolated genome but which does not include the
entire genome. Genomic DNA comprises the entire genetic component of a species
excluding, where applicable, mitochondrial and chloroplast DNA. Of course, the methods of
the invention can also be used to analyze mitochondrial, chloroplast, etc., DNA as well.
Depending on the particular species of the subject being analyzed, the genomic DNA can
vary in complexity. For instance, species which are relatively low on the evolutionary
scale, such as bacteria, can have genomic DNA which is significantly less complex than
species higher on the evolutionary scale. Bacteria, such as E. coli, have approximately 2.4
X 10^ grams/mol of haploid genome, and bacterial genomes having a size of less than
about 5 million base pairs (5 megabases) are known. Genomes of intermediate
complexity, such as those of plants, for instance, rice, have a genome size of
approximately 700-1000 megabases. Genomes of highest complexity, such as maize or
humans, have a genome size of approximately 10^-10^\ Humans have approximately
7.4 X 10^^ grams/mol of haploid genome.

The methods of the invention are useful for identifying haplotype information in
subjects. A subject, as used herein, refers to any type of DNA-containing organism, and
includes, for example, bacteria, viruses, fungi, animals (including vertebrates and
invertebrates), and plants.

A "RCG" as used herein is a reproducible fraction of an isolated genome which is
composed of a plurality of DNA fragments. The RCG can be composed of random or
non-random segments or arbitrary or non-arbitrary segments. The term "reproducible
fraction" refers to a portion of the genome which encompasses less than the entire native
genome. If a reproducible fraction is produced twice or more using the same
experimental conditions, the fractions produced in each repetition include at least 50% of
the same sequences. In some embodiments the fractions include at least 70%, 80%,
90%, 95%, 97%, or 99% of the same sequences, depending on how the fractions are
produced. For instance, if a RCG is produced by PCR, another RCG can be generated
under identical experimental conditions having at a minimum greater than 90% of the
sequences in the first RCG. Other methods for preparing a RCG such as size selection
are still considered to be reproducible but often produce less than 99% of the same
sequences.
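
The reproducibility criterion can be illustrated with a short calculation. The following Python sketch is illustrative only and is not part of the described methods. It assumes each RCG can be modeled as a set of fragment sequences. The function names `shared_fraction` and `is_reproducible` and the example sequences are hypothetical.

```python
def shared_fraction(first_rcg, second_rcg):
    """Return the fraction of sequences in first_rcg that also occur in second_rcg."""
    first = set(first_rcg)
    if not first:
        raise ValueError("first_rcg must contain at least one sequence")
    return len(first & set(second_rcg)) / len(first)


def is_reproducible(first_rcg, second_rcg, threshold=0.50):
    """Apply the 'at least 50% of the same sequences' criterion (threshold adjustable)."""
    return shared_fraction(first_rcg, second_rcg) >= threshold


# Hypothetical fragment sequences from two repetitions of the same preparation.
run_1 = {"ACGTAC", "GGCATT", "TTAGCA", "CCGGAA"}
run_2 = {"ACGTAC", "GGCATT", "TTAGCA", "ATATAT"}

print(shared_fraction(run_1, run_2))        # 3 of the 4 sequences are shared
print(is_reproducible(run_1, run_2))        # checked against the 50% criterion
print(is_reproducible(run_1, run_2, 0.90))  # checked against a stricter 90% criterion
```

Passing a higher threshold corresponds to the stricter embodiments (70%, 80%, 90%, and so on) listed above.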

A "plurality" of elements, as used throughout the application, refers to 2 or more
of the element. A "DNA fragment" is a polynucleotide sequence obtained from a
genome at any point along the genome and encompassing any sequence of nucleotides.
The DNA fragments of the invention can be generated according to either of two types
of mechanisms, and thus there are two types of RCGs, PCR-generated RCGs and native
RCGs.
The nucleic acid sample may be prepared using conventional PCR amplification
of a polymorphic locus from a genomic DNA sample using known primers.
Alternatively, PCR-generated RCGs are randomly primed. That is, the
polynucleotide fragments in the PCR-generated RCG all have common sequences at or
near the 5' and 3' ends of the fragment (when a tag is used in the primer, all of the 5' and
3' ends are identical; when a tag is not used, the 5' and 3' ends have a series of N's
followed by the TARGET sequence, reading in a 5' to 3' direction; the TARGET
sequence is identical in each primer, with the exception of multiple-primed DOP-PCR),
but the remaining nucleotides within the fragments do not have any sequence relation to
one another. Thus, each polynucleotide fragment in a RCG includes a common 5' and 3'
sequence which is determined by the constant region of the primer used to generate the
RCG. For instance, if the RCG is generated using DOP-PCR (described in more detail
below) each polynucleotide fragment would have near the 5' or 3' end nucleotides that
are determined by the "TARGET nucleotide sequence". The TARGET nucleotide
sequence is a sequence which is selected arbitrarily but which is constant within a set or
subset (e.g., multiple primed DOP-PCR) of primers. Thus, each polynucleotide fragment
can have the same nucleotide sequence near the 5' and 3' end arising from the same
TARGET nucleotide sequence. In some cases more than one primer can be used to
generate the RCG. When more than one primer is used, each member of the RCG would
have a 5' and 3' end in common with at least one other member of the RCG and, more
preferably, each member of the RCG would have a 5' and 3' end in common with at
least 5% of the other members of the RCG. For example, if a RCG is prepared using
DOP-PCR with 2 different primers having different TARGET nucleotide sequences, a
population consisting of four sets of PCR products having common ends could be
generated. One set of PCR products could be generated having the TARGET nucleotide
sequence of the first primer at or near both the 5' and 3' ends and another set could be
generated having the TARGET nucleotide sequence of the second primer at or near both
the 5' and 3' ends. Another set of PCR products could be generated having the
TARGET nucleotide sequence of the second primer at or near the 5' end and the
TARGET nucleotide sequence of the first primer at or near the 3' end. A fourth set of
PCR products could be generated having the TARGET nucleotide sequence of the
second primer at or near the 3' end and the TARGET nucleotide sequence of the first
primer at or near the 5' end. The PCR generated genomes are composed of synthetic
DNA fragments.
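
The count of four product sets in the two-primer example follows from pairing each possible 5' end with each possible 3' end. The Python sketch below is an illustration only. It simplifies each fragment end to a label naming the primer whose TARGET sequence determines it. The function name `end_combinations` and the labels `TARGET_1` and `TARGET_2` are hypothetical placeholders, not actual sequences.

```python
from itertools import product


def end_combinations(target_labels):
    """List every (5' end, 3' end) pairing of TARGET sequence labels."""
    return list(product(target_labels, repeat=2))


primers = ["TARGET_1", "TARGET_2"]  # hypothetical labels for two DOP-PCR primers
sets = end_combinations(primers)

for five_prime, three_prime in sets:
    print(f"5' end: {five_prime}   3' end: {three_prime}")
print(len(sets))  # number of distinct product sets for two primers
```

Under this simplification, n primers with different TARGET sequences would give n x n possible sets of common ends.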
The DNA fragments of the native RCGs have arbitrary sequences. That is, each
of the polynucleotide fragments in the native RCG does not necessarily have any sequence
relation to another fragment of the same RCG. These sequences are selected based on
other properties, such as size or secondary characteristics. These sequences are referred
to as native RCGs because they are prepared from native nucleic acid preparations rather
than being synthesized. Thus they are native, non-synthetic DNA fragments. The
fragments of the native RCG may share some sequence relation to one another (e.g., if
produced by restriction enzymes). In some embodiments they do not share any sequence
relation to one another.

In some preferred embodiments, the RCG includes a plurality of DNA fragments
ranging in size from approximately 200 to 2,000 nucleotide residues. In a preferred
embodiment, a RCG includes from 95 to 0.05% of the intact native genome. The
fraction of the isolated genome which is present in the RCG of the invention represents
at most 90% of the isolated genome, and in preferred embodiments, contains less than
50%, 40%, 30%, 20%, 10%, 5%, or 1% of the genome. A RCG preferably includes
between 0.05 and 1% of the intact native genome. In a preferred embodiment, the RCG
encompasses 10% or less of an intact native genome of a complex organism.

Several methods can be used to generate PCR-generated RCGs, including IRS-PCR,
AP-PCR, DOP-PCR, multiple primed PCR, adaptor-PCR and multiple-primed-DOP-PCR.
Hybridization conditions for particular PCR methods are selected in the
context of the primer type and primer length to yield a set of DNA fragments
which is a percentage of the genome, as defined above. PCR methods have been
described in many references; see e.g., U.S. Patent Nos. 5,104,792; 5,106,727; 5,043,272;
5,487,985; 5,597,694; 5,731,171; 5,599,674; and 5,789,168. Basic PCR methods have
been described in e.g., Saiki et al., Science, 230: 1350 (1985) and U.S. Pat. Nos.
4,683,195, 4,683,202 (both issued Jul. 28, 1987) and U.S. Pat. No. 4,800,159 (issued Jan.
24, 1989).

Another method for generating RCGs is based on the development of native
RCGs. Several methods can be used to generate native RCGs, including DNA fragment
size selection, isolating a fraction of DNA from a sample which has been denatured and
reannealed, pH-separation, separation based on secondary structure, etc.

Size selection can be used to generate a RCG by separating polynucleotides in a
genome into different fractions wherein each fraction contains polynucleotides of an
approximately equal size. One or more fractions can be selected and used as the RCG.
The number of fractions selected will depend on the method used to fragment the
genome and to fractionate the pieces of the genome, as well as the total number of
fractions. In order to increase the complexity of the RCG, more fractions are selected.
One method of generating a RCG involves fragmenting a genome into arbitrarily sized
pieces and separating the pieces on a gel (or by HPLC or another size fractionation
method). A portion of the gel is excised, and DNA fragments contained in the portion
are isolated. Typically, restriction enzymes can be used to produce DNA fragments in a
reproducible manner.
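
As a rough illustration of size selection, the Python sketch below keeps only fragments whose lengths fall within a chosen window and reports what share of the starting material is retained. It is a simplified model, not a laboratory protocol. The default window of 200 to 2,000 nucleotides is borrowed from the fragment size range given for some preferred embodiments, and the function names and fragment lengths are hypothetical.

```python
def size_select(fragment_lengths, min_len=200, max_len=2000):
    """Keep fragments whose length falls within [min_len, max_len] nucleotides."""
    return [n for n in fragment_lengths if min_len <= n <= max_len]


def selected_fraction(fragment_lengths, selected):
    """Fraction of the total nucleotides retained by the selection."""
    total = sum(fragment_lengths)
    return sum(selected) / total if total else 0.0


lengths = [150, 450, 900, 1800, 2500, 4200]  # hypothetical fragment lengths
kept = size_select(lengths)

print(kept)                              # fragments inside the size window
print(selected_fraction(lengths, kept))  # share of the starting nucleotides retained
```

Widening the window, or selecting additional fractions, raises the retained share and therefore the complexity of the resulting RCG.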

Different nucleic acid sources may be used to generate RCGs. For instance,
mitochondrial DNA can be isolated and used as the source of the RCG.

Separation based on secondary structure can be accomplished in a manner similar
to size selection. Different fractions of a genome having secondary structure can be
separated on a gel. One or more fractions are excised from the gel, and DNA fragments
are isolated therefrom.

Another method for creating a native RCG involves isolating a fraction of DNA
from a sample which has been denatured and reannealed. A genomic DNA sample is
denatured, and denatured nucleic acid molecules are allowed to reanneal under selected
conditions. Some conditions allow more of the DNA to be reannealed than other
conditions. These conditions are well known to those of ordinary skill in the art. Either
the reannealed or the remaining denatured fractions can be isolated. It is desirable to
select the smaller of these two fractions in order to generate a RCG. The reannealing
conditions used in the particular reaction determine which fraction is the smaller fraction.
Variations of this method can also be used to generate RCGs. For instance, once a
portion of the fraction is allowed to reanneal, the double stranded DNA may be removed
(e.g., using column chromatography), the remaining DNA can then be allowed to
partially reanneal, and the reannealed fraction can be isolated and used. This variation is
particularly useful for removing repetitive elements of the DNA, which rapidly reanneal.

The amount of isolated genome used in the method of preparing RCGs will vary,
depending on the complexity of the initial isolated genome. Genomes of low
complexity, such as bacterial genomes having a size of less than about 5 million base
pairs (5 megabases), usually are used in an amount from approximately 10 picograms to
about 250 nanograms. A more preferred range is from 30 picograms to about 7.5
nanograms, and even more preferably, about 1 nanogram. Genomes of intermediate
complexity, such as plants (for instance, rice, having a genome size of approximately
700-1,000 megabases) can be used in a range of from approximately 0.5 nanograms to
250 nanograms. More preferably, the amount is between 1 nanogram and 50 nanograms.
Genomes of highest complexity (such as maize or humans, having a genome size of
approximately 3,000 megabases) can be used in an amount from approximately 1
nanogram to 250 nanograms (e.g., for PCR).

In other aspects of the invention, the nucleic acid sample can be an entire or a
portion of an RNA genome. RNA genomes differ from DNA genomes in that they are
generated from RNA rather than from DNA. An RNA genome can be, for instance, a
cDNA preparation made by reverse transcription of RNA obtained from cells of a subject
(e.g., human ovarian carcinoma cells). Thus, an RNA genome can be composed of DNA
sequences, as long as the DNA is derived from RNA. RNA samples can also be used
directly.

Each of the types of nucleic acid samples set forth herein is described in more
detail in co-pending U.S. Patent Application No. 09/404,912, filed on September 24,
1999, which is hereby incorporated by reference.

The methods of the invention involve analysis of at least two SNPs to identify the
haplotype. The two SNPs are referred to as SNP1 or the first SNP and SNP2 or the
second SNP. The reference to a first or second SNP does not provide an indication of
the order of the SNPs on the nucleic acid. A "single nucleotide polymorphism" or
"SNP" as used herein is a single base pair (i.e., a pair of complementary nucleotide
residues on opposite genomic strands) within a DNA region wherein the identities of the
paired nucleotide residues vary from individual to individual. At the variable base pair
(alleles) in the SNP, two or more alternative base pairings can occur at a relatively high
frequency in a subject (e.g., human) population.

A "polymorphic region" is a region or segment of DNA the nucleotide sequence
of which varies from individual to individual. The two DNA strands which are
complementary to one another except at the variable positions are referred to as alleles.
A polymorphism is allelic because some members of a species have one allele and other
members have a variant allele and some have both. When only one variant sequence
exists, a polymorphism is referred to as a diallelic polymorphism. There are three
possible genotypes in a diallelic polymorphic DNA in a diploid organism. These three
genotypes arise because it is possible that a diploid individual's DNA may be
homozygous for one allele, homozygous for the other allele, or heterozygous (i.e., having
one copy of each allele). When other mutations are present, it is possible to have
triallelic or higher order polymorphisms. These multiple mutation polymorphisms
produce more complicated genotypes.
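
The genotype counts can be checked by enumerating the unordered pairs of alleles that a diploid individual can carry. The Python sketch below is illustrative only; the function name `diploid_genotypes` and the allele letters are hypothetical examples.

```python
from itertools import combinations_with_replacement


def diploid_genotypes(alleles):
    """Enumerate the unordered allele pairs a diploid individual can carry."""
    return list(combinations_with_replacement(alleles, 2))


diallelic = diploid_genotypes(["A", "G"])
triallelic = diploid_genotypes(["A", "G", "T"])

print(diallelic)        # two homozygous genotypes and one heterozygous genotype
print(len(triallelic))  # a triallelic polymorphism gives more genotypes
```

In general, n alleles give n(n+1)/2 diploid genotypes, which is why triallelic and higher order polymorphisms produce more complicated genotypes.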

A "polymorphic locus", as used herein, refers to a region of a nucleic acid that
includes more than one single nucleotide polymorphism.

In one embodiment, the method for haplotyping involves a bi-phasic allele-specific
oligonucleotide hybridization technology. Briefly, the method is carried out as
shown in Figure 1 for a haplotype consisting of two SNP loci. During the first phase of
the method, a SNP1 allele-specific oligonucleotide (ASO) is synthesized and attached to
a surface. A nucleic acid sample is then prepared using a method such as amplification
of a genome to produce a nucleic acid sample containing the polymorphic locus.
Optionally, the nucleic acid sample can be labeled. The sample is then allowed to
hybridize to the SNP1 ASO coated on the surface to produce a SNP1 ASO/nucleic acid sample
complex. Excess is removed. In Phase 2, a labeled SNP2 ASO is synthesized
and allowed to hybridize to the SNP1 ASO/nucleic acid sample complex. The entire
surface is then scanned and the haplotype can be scored. The SNP1 ASO is actually a
set of ASOs which includes two allele-specific oligonucleotides corresponding to an
anti-sense version of each allele of the first of the two SNPs of a polymorphic locus. The
second set of ASOs corresponds to the second SNP (SNP2) of the polymorphic locus. In the method,
the nucleic acid sample will only hybridize to the ASO of the first set of ASOs which is
anti-sense to the allele of the first SNP in the genomic sample. Likewise, only the ASO
of the second set of ASOs which is anti-sense to the allele of the second SNP present in
the nucleic acid sample will hybridize. These haplotyping methods for identifying 2
SNPs generally involve the analysis of 4 wells. If a subject is homozygous, analysis of
their DNA will result in a signal in one well. If a subject is heterozygous, analysis of
their DNA will result in a signal in two wells.
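
The four-well readout can be sketched as a small scoring routine. The Python example below is a minimal illustration under stated assumptions and is not the scoring procedure of the invention. It assumes each well pairs one surface-bound SNP1 ASO with one labeled SNP2 ASO, and that a detected label means that allele combination is present on a single molecule. The function name `score_wells`, the allele letters, and the signal values are hypothetical.

```python
from itertools import product


def score_wells(signals, snp1_alleles=("A", "G"), snp2_alleles=("C", "T")):
    """Map four well readings to the haplotypes present in the sample.

    signals: dict keyed by (SNP1 allele, SNP2 allele) -> True if label detected.
    """
    wells = list(product(snp1_alleles, snp2_alleles))
    missing = [w for w in wells if w not in signals]
    if missing:
        raise ValueError(f"no reading for wells: {missing}")
    haplotypes = [w for w in wells if signals[w]]
    if len(haplotypes) not in (1, 2):
        raise ValueError("expected a signal in one or two wells")
    return haplotypes


# Hypothetical subject heterozygous at both SNPs, carrying haplotypes A-C and G-T.
readings = {
    ("A", "C"): True,
    ("A", "T"): False,
    ("G", "C"): False,
    ("G", "T"): True,
}
print(score_wells(readings))
```

Because each signal reports both alleles on the same captured molecule, a subject heterozygous at both SNPs is resolved into two haplotypes rather than an unphased pair of genotypes.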
In general, the high throughput SNP haplotyping methods of the invention are
useful in linkage disequilibrium studies for the analysis of complex traits to localize
genes involved in diseases such as diabetes, multiple sclerosis, and asthma; diagnostic
analysis to determine the presence or absence of a predisposing disease haplotype or
other trait; pharmacogenomic analysis to identify haplotypes that correlate with either
positive or negative responses to drugs and development; genome-wide scan studies for
complex trait analysis using SNP haplotypes, instead of single SNPs, to increase the
statistical power; etc.

Deletions, multiplications, or substitutions in genes can result in genetic disease.
Most of these deletions, multiplications, or substitutions, causing multiple alleles,
produce indistinguishable or distinguishable "normal" phenotypes. For instance,
multiple alleles produce variable characteristics like eye color. Some genetic alterations,
however, are associated with clinical disease like sickle cell anemia. The haplotyping
methods of the invention are useful for identifying both normal phenotypes and disease
phenotypes. Thus, the methods of the invention are useful for identifying traits such as
eye color as well as for diagnostics to determine presence or absence of a predisposing
disease haplotype in a subject. Some diseases which are known to have a genetic
element include colon cancer, breast cancer, cystic fibrosis, neurofibromatosis type 2,
Li-Fraumeni disease, Von Hippel-Lindau disease, thalassemia, ornithine
transcarbamylase deficiency, hypoxanthine-guanine-phosphoribosyl-transferase
deficiency, phenylketonuria, etc.

Another recently identified phenomenon is that the inheritance of varying
haplotypes within the same gene can alter a disease phenotype altogether. This is
exemplified by polymorphic mutations in the prion gene, PrP. (Goldfarb, L.G. and
Petersen, R.B., Science, 258:806-808 (1992)). Individuals that inherit a SNP
polymorphism at codon 178 of the PRP gene will develop Creutzfeldt-Jakob disease.
If the individual also inherits a concomitant SNP polymorphism at codon 129 of the PRP
gene, then that individual will develop fatal familial insomnia instead. Therefore, the
precise haplotype inherited can change the effect of the mutations involved, resulting in
distinctly different phenotypic diseases. The methods of the invention are useful for
making these types of distinctions.

Identification of haplotypes associated with phenotypic traits is useful for many
purposes in addition to identifying predisposition to disease. For example, identification
of a correlation between susceptibility to a particular drug or a therapeutic treatment and
specific genetic alterations is particularly useful for tailoring therapeutic treatments to a
specific individual. The methods are also useful in prenatal screening to identify
whether a fetus is afflicted with or is predisposed to develop a serious disease.
Additionally, this type of information is useful for screening animals or plants bred for
the purposes of enhancing or exhibiting desired characteristics.

Other methods for high throughput haplotyping, according to the invention,
involve the identification of SNPs in solution. In one aspect the invention is a method
for haplotyping by analyzing a genotype of a first SNP of a polymorphic locus of a
nucleic acid within a sample in solution by detecting the presence or absence of a first
labeled probe which specifically identifies a first putative allele of the SNP and detecting
the presence or absence of a second labeled probe which specifically identifies a second
putative allele of the SNP, separating the nucleic acid sample based on the genotype of
the first SNP, and analyzing a second SNP of the polymorphic locus of the separated
nucleic acid samples to identify the haplotype of the nucleic acid.

The first and second allele of the first SNP of the polymorphic locus are detected
in solution using labeled probes. The labeled probes are any type of molecule which
specifically binds to one allele of the SNP and not the other and which includes a
detectable label. The molecule which specifically interacts with one of the two alleles
can be any type of molecule; for instance, it may be a DNA-specific binding protein or
an ASO complementary to the allele-containing DNA. A label may be a light-emissive
label, radioactive label, etc. Light-emissive labels can be added to the molecule or may
be naturally-occurring within the molecule. For instance, some bases of a nucleotide are
naturally-occurring light-emissive labels. In the case when a naturally-occurring
light-emissive label is used, an extrinsic label does not need to be added to the molecule.
Light-emissive labels which can be added to molecules include fluorophores and
quenchers, light-scattering particles (such as gold particles which scatter light), etc.
Radioactive labels include, but are not limited to, 3H, 32P, and 35S. The use of each of
these types of labels is well-known to those of ordinary skill in the art.

Once the first SNP has been identified using a labeled probe, the nucleic acid
sample is separated such that the DNA molecules containing the first allele are in a
separate container from the DNA molecules containing the second allele. One method for
accomplishing this separation is through the use of flow cytometry, e.g., using a
fluorescence-activated cell sorter (FACS). Flow cytometry analysis involves the
separation of single molecules based on the presence of a particular fluorescence marker.
Thus, a nucleic acid molecule which includes a labeled probe that emits in the red light
wavelength will be separated from the nucleic acid molecules hybridized to a labeled
probe which emits light in the green wavelength. Once the two samples are separated,
each can be separately analyzed to identify the presence or absence of an allele at the
second SNP. Other methods for separating the nucleic acid samples based on the allele
present in the sample include, but are not limited to, (1) the use of an ASO attached to
different size beads which can be separated by size, affinity, or weight, and (2) the use of
tags such as binding partners which can be separated based on their specific binding
interactions.

In other aspects, the invention involves solution phase analysis that utilizes four
labeled probes, each specific for an allele of the two SNPs. In this analysis, the labeled
probes are allowed to interact with the nucleic acid sample to form complexes. The
labeled complexes are then separated such that each nucleic acid complex is separate
from one another. Thus, this analysis is based on single molecule detection strategies.
Each individual nucleic acid is separated from other nucleic acid molecules. The
separate nucleic acid molecules are then detected for the presence or absence of each of
the four labeled probes.

The method can be accomplished, for example, as shown in the schematic
diagram of Figure 4. In the figure, it is shown that each single nucleic acid sample,
which has been separated, includes two of the four labeled probes, one specific for a first
allele of the first SNP and the other specific for an allele of the second SNP. In the
examples shown, the first labeled probe includes a label which is stimulated in the red
wavelength of light to produce a signal detected in the green wavelength. The second
labeled probe is stimulated by light in the orange wavelength and emits light in
the yellow wavelength. The single molecule, when subjected to light within the red
wavelength spectrum, will emit green light that can be detected. Likewise, the sample,
when exposed to light within the orange wavelength, will emit yellow light. Thus, if the
sample, when exposed to red and orange light, emits green and yellow, this is indicative
that the first and third labeled probes, specifically identifying the first allele of the first
SNP and the first allele of the second SNP, are present. Alternatively, when light of red
and orange wavelengths is used to stimulate the sample and blue and red light
wavelengths are emitted, this is indicative that the second and fourth labeled probes have
bound to the nucleic acid sample, thus identifying the presence of the second allele of the
first SNP and the second allele of the second SNP. Other combinations would be
indicative of other haplotypes which are possible in this two SNP system. Other
combinations of labels can be used. For instance, each of the 4 labeled probes can be
labeled with a molecule that is stimulated in one wavelength of light, e.g., red, as long as
each labeled probe emits in a different spectrum (or one may quench). Alternatively,
each of the 4 labeled probes can be labeled with a molecule that is stimulated by distinct
wavelengths of light, but all can emit in the same or different spectrums.
In some preferred embodiments fluorescence detection of single molecules is
used to identify the components of the polymorphic locus. A fluorescent label or
fluorophore is a substance which is capable of exhibiting fluorescence within a
detectable range. Fluorophores include, but are not limited to, fluorescein
isothiocyanate, fluorescein amine, eosin, rhodamine, dansyl, umbelliferone,
5-carboxyfluorescein (FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE),
rhodamine, 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine
(TAMRA), 6-carboxy-X-rhodamine (ROX), 4-(4'-dimethylaminophenylazo)benzoic
acid (DABCYL), 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS),
4-acetamido-4'-isothiocyanatostilbene-2,2'-disulfonic acid, acridine, acridine
isothiocyanate, 4-amino-N-[3-(vinylsulfonyl)phenyl]naphthalimide-3,5-disulfonate
(Lucifer Yellow VS), N-(4-anilino-1-naphthyl)maleimide, anthranilamide, Brilliant
Yellow, coumarin, 7-amino-4-methylcoumarin, 7-amino-4-trifluoromethylcoumarin
(Coumarin 151), cyanosine, 4',6-diamidino-2-phenylindole (DAPI),
5',5"-dibromopyrogallol-sulfonephthalein (Bromopyrogallol Red),
7-diethylamino-3-(4'-isothiocyanatophenyl)-4-methylcoumarin,
diethylenetriamine pentaacetate, 4,4'-diisothiocyanatodihydro-stilbene-2,2'-disulfonic
acid, 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid,
4-dimethylaminophenylazophenyl-4'-isothiocyanate (DABITC), eosin isothiocyanate,
erythrosin B, erythrosin isothiocyanate, ethidium, 5-(4,6-dichlorotriazin-2-yl)
aminofluorescein (DTAF), QFITC (XRITC), fluorescamine, IR144, IR1446, Malachite
Green isothiocyanate, 4-methylumbelliferone, ortho cresolphthalein, nitrotyrosine,
pararosaniline, Phenol Red, B-phycoerythrin, o-phthaldialdehyde, pyrene, pyrene
butyrate, succinimidyl 1-pyrene butyrate, Reactive Red 4 (Cibacron.RTM. Brilliant Red
3B-A), lissamine rhodamine B sulfonyl chloride, rhodamine B, rhodamine 123,
rhodamine X isothiocyanate, sulforhodamine B, sulforhodamine 101, sulfonyl chloride
derivative of sulforhodamine 101 (Texas Red), tetramethyl rhodamine, tetramethyl
rhodamine isothiocyanate (TRITC), riboflavin, rosolic acid, and terbium chelate
derivatives.

Fluorescence is measured using a fluorometer. The optical emission from the
fluorescent molecule can be detected by the fluorometer and processed as a signal.
When fluorescence is being measured in a sample fixed to various portions of the
surface, the surface can be moved using a multi-axis translation stage in order to
position the different areas of the surface, such that the signal can be collected. Many
types of fluorometers have been developed. For instance, a new sensitive instrument for
measuring FRET is described in U.S. Patent No. 5,911,952.

In other aspects the invention is a method for haplotyping which is accomplished
by performing four hybridization reactions on a nucleic acid sample, each of the four
hybridization reactions involving one ASO specific for one allele of one of two SNPs,
each of the ASOs labeled with a spectrally distinct label and wherein each label on the
ASO specific for a first of the two SNPs is a spectral pair with the label on each ASO
specific for the second of the two SNPs, bringing each of the labeled ASOs in each
hybridization reaction within energy transfer distance from one another, exciting one of
the labeled ASOs in each hybridization reaction, and detecting light released from the
other labeled ASO as a signal, wherein the presence or absence of a signal for each
hybridization reaction is an indicator of the haplotype of the nucleic acid sample.

A process referred to as a molecular beacon for nucleic acid detection has
previously been described. The method involves the use of a probe which is in the form
of a stem loop structure, such that the 3' and 5' ends of the nucleic acid probe are
adjacent one another in the stem section. The 5' and 3' ends are labeled with a donor
fluorophore and a quencher. When this probe encounters a complementary nucleic acid
within the sample, the secondary structure stem loop is destabilized and the 5' and 3'
ends are moved away from one another. This causes the fluorescent group to emit light
which is no longer quenched and thus an increase in fluorescence emission occurs.
(Discussed in U.S. Patent No. 5,989,823). This type of analysis has been used in the
detection of alleles within a nucleic acid sample. (Kostrikis et al., Science, 279:1228-1229
(1998) and Tyagi et al., Nature Biotechnology, 16:49-53 (1998)).

This method is similar to the methods of the invention, except that in the methods
of the invention binding partners are used to bring the fluorophores within proximity
of one another. Binding partners are two molecules which specifically interact with one
another when brought into proximity with one another. Many types of binding partners
are known in the art. Some well known examples of binding partners are biotin and
avidin or streptavidin, as well as antibody and antigen. These binding partners are used
to bring the regions of the nucleic acid housing the two SNPs within proximity of one
another. For instance, the first SNP may be labeled with an ASO which is conjugated to
biotin. The second SNP may be hybridized with an ASO which is conjugated to avidin.
Either the biotin or the avidin may contain fluorophores which, when brought within
proximity of one another, will produce a signal, or the ASO may contain the fluorophore
label which would be brought in proximity with the other fluorophore label when the
biotin and avidin interact.

Streptavidin and biotin labeled with various fluorophores are commercially
available from several sources including Molecular Probes (Eugene, Oregon), Intergen
(Purchase, NY) and NEN (Boston, MA).

Fluorescence resonance energy transfer (FRET) is the transfer of electronic
excitation energy by the Förster mechanism. FRET is useful for measuring the distance
between a pair of fluorophores (donor and acceptor) which are in a range of 10-80
angstroms from one another. FRET has previously been used to study the hybridization
of complementary oligodeoxynucleotides (Cardullo et al., PNAS, USA, 85:8790-8794
(1988)), and various other binding assays.

FRET arises from certain fluorophores which when excited by exposure to a
particular wavelength of light will emit light at a different wavelength. A donor
fluorophore absorbs a photon of energy and transfers this energy non-radiatively to the
acceptor fluorophore. When the excitation and emission spectra of two fluorophores
which are brought within close proximity of one another overlap, the excitation of one
fluorophore will cause it to emit light at a wavelength that is absorbed and that can
stimulate the second fluorophore, causing it to fluoresce. During this process, the
fluorescence of the donor molecule is quenched and fluorescence intensity of the
acceptor molecule is enhanced. If the donor is in proximity with a fluorophore which is
a non-acceptor (referred to as a quencher), the fluorescence of the donor is still quenched
but there is no subsequent emission of fluorescence by the second fluorophore, or
quencher. Thus, there is no emission of light.

When selecting fluorophores for FRET analysis several parameters can be
considered. U.S. Patent No. 4,996,143 describes some of the parameters that should be
considered when designing fluorescent probes, such as the spacing of the fluorescent
moieties and the length of the portion of the molecule which connects the fluorescent
moiety to the base unit of the nucleic acid. In order for FRET to occur, the donor and
acceptor molecules should be within 100 angstroms of one another. When attached to
nucleic acids, preferably, the donor and acceptor fluorophores are within 1 to 20 base
pairs of one another for FRET analysis. Additionally, when performing FRET analysis
fluorophores which are spectral pairs should be used. Two fluorophores are spectral
pairs when one of the two fluorophores emits light at a wavelength which either causes
the other fluorophore to emit light of a different wavelength or is quenched by the
other fluorophore without producing additional light. For instance, FAM is
excited by light with a wavelength of approximately 488 nm and emits light with a
spectrum of 500-650 nm. Thus, FAM is a suitable donor fluorophore for use with JOE,
TAMRA and ROX, all of which have an excitation maximum of 514 nm, and a spectral
pair is formed when FAM is matched with either JOE, TAMRA or ROX. Appropriate
spectral pairs among the known fluorophores are well known to those of ordinary skill in
the art.
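
The distance limits quoted above reflect the steep distance dependence of Förster
transfer. As a rough illustration only, the sketch below evaluates the standard Förster
relation E = 1/(1 + (r/R0)^6), which is not stated in this specification. The function
name and the default Förster radius of 50 angstroms are assumptions chosen for
illustration; the actual R0 depends on the particular spectral pair.

```python
def fret_efficiency(distance_angstrom: float,
                    forster_radius_angstrom: float = 50.0) -> float:
    """Return the Förster transfer efficiency E = 1 / (1 + (r / R0)**6).

    forster_radius_angstrom (R0) is the distance at which E = 0.5.
    The 50 angstrom default is a hypothetical value, not one taken
    from the specification.
    """
    if distance_angstrom < 0 or forster_radius_angstrom <= 0:
        raise ValueError("distance must be >= 0 and R0 must be > 0")
    ratio = distance_angstrom / forster_radius_angstrom
    return 1.0 / (1.0 + ratio ** 6)


if __name__ == "__main__":
    # Efficiency falls off sharply once r exceeds R0.
    for r in (10, 50, 80, 100):
        print(f"r = {r:>3} angstroms -> E = {fret_efficiency(r):.3f}")
```

Under these assumptions the efficiency is close to 1 at 10 angstroms and falls below 2%
at 100 angstroms, which is consistent with the working ranges given above.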

Examples

Example 1: Haplotype analysis of multiple individuals using a double
hybridization method.

Two single nucleotide polymorphisms (spaced 212 nucleotides apart) commonly
occur in the dopamine D2 receptor gene. SNP1 is a T to G transversion at nucleotide
3208 and SNP2 is a C to T transition at nucleotide 3420 (Sarkar, G., et al., Genomics,
11:8-14 (1991)).

These polymorphisms were shown to be common in all races examined, with
allele frequencies ranging from 39% to 49% (Sarkar, G., et al., Genomics, 11:8-14
(1991)). This system has been previously used to demonstrate the 3' mismatch PCR-SSP
haplotyping technique (Sarkar, G., et al., Biotechniques, 10:437-440 (1991)). This
system is also ideal for demonstrating the efficacy of the SNP-Haplotyping Method of
the invention for the following reasons:

1. These polymorphisms are commonly represented in the population, i.e.
exhibit high allele frequencies.

2. They are in close proximity to one another, facilitating the generation of PCR
products containing both polymorphisms from test genomes.

3. They exhibit no linkage disequilibrium such that no single haplotype at this
locus dominates in the population.

1. Verification of the double hybridization SNP-Haplotyping method.

To verify the feasibility, efficacy and reliability of the SNP-Haplotyping Method,
haplotypes at the D2 receptor locus are determined for multiple individuals, using
standard sequence analysis. The haplotypes determined by sequence analysis are used
for comparison to the haplotypes determined by the SNP-Haplotyping Method of the
invention.

The sequencing step is performed as follows:

1. Primer 1 (M13 forward-GGTCAGTGACATCGTTGCGT) (SEQ ID NO:1) and Primer 2
(M13 reverse-CATGCCCATTCTTCTCTGGT) (SEQ ID NO:2) flank the region
containing SNP1/SNP2 of the D2 receptor polymorphic locus. Primer 1 contains an
M13 forward sequence at the 5' end and Primer 2 contains an M13 reverse sequence
at the 5' end to facilitate sequencing of the PCR products. The expected size of the
PCR product is 350 base pairs.

2. DNA from unrelated individuals is obtained from the National Human Genome
Research Institute (NHGRI) which is a database containing a standardized collection
of DNA from 450 unrelated individuals. Primers 1 and 2 are used to PCR amplify
the polymorphic locus from the D2 receptor gene from all of the individuals using a
proof-reading thermostable DNA polymerase such as Pfu polymerase, as previously
described (Sarkar, G., et al., Genomics, 11:8-14 (1991)).

3. Each of the PCR reactions is separated by agarose gel electrophoresis and the PCR
products cut from the gel and purified. These purified PCR products represent the D2
receptor polymorphic locus from the individuals.

4. The genotype of the polymorphic locus for each individual is determined by
sequencing an aliquot of each purified PCR product using dye-labeled M13 forward
and reverse primers.

5. The haplotype of the polymorphic locus for each individual is determined as follows:

a) An aliquot of each purified PCR product is subcloned into a plasmid vector such
as TA vector (Invitrogen), and transformed into the appropriate strain of E. coli.
This results in multiple transformations, one for each individual.

b) Six colonies are picked from each transformation and plasmid DNA is isolated
from all colonies. Picking 6 colonies/transformation results in a >96% chance
that the loci from both chromosomes (alleles) of each individual are represented
and therefore analyzed.

c) The plasmid inserts, representing the D2 receptor polymorphic locus for each
individual, are sequenced using vector-specific primers. The sequences are
analyzed to determine the haplotype of the SNP1/SNP2 locus for each of the
individuals.
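
The ">96%" figure in step 5(b) can be checked with a short calculation. The sketch
below assumes that each picked colony independently carries either of the two
chromosomal copies with equal probability; this is an assumption made for illustration,
and any cloning bias would change the result. The function name is hypothetical.

```python
def prob_both_alleles_sampled(n_colonies: int,
                              p_first_allele: float = 0.5) -> float:
    """Probability that n picked colonies include both chromosomal copies.

    Assumes independent colonies, each carrying the first copy with
    probability p_first_allele (0.5 by default, i.e. no cloning bias).
    The only failures are "all colonies carry copy 1" and
    "all colonies carry copy 2".
    """
    if n_colonies < 1:
        raise ValueError("at least one colony must be picked")
    if not 0.0 <= p_first_allele <= 1.0:
        raise ValueError("p_first_allele must lie in [0, 1]")
    p = p_first_allele
    q = 1.0 - p
    return 1.0 - p ** n_colonies - q ** n_colonies


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"{n} colonies: {prob_both_alleles_sampled(n):.4f}")
```

With six colonies and no bias this gives 1 - 2(1/2)^6 = 0.96875, in line with the
stated >96% chance.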

Genotypes

Nine combinations of (SNP1 : SNP2) genotypes are possible for the two SNP loci
and each of the tested individuals is expected to possess one of the following:

G/T : C/T     G/G : C/T     T/T : C/T
G/T : C/C     G/G : C/C     T/T : C/C
G/T : T/T     G/G : T/T     T/T : T/T

Haplotypes

Four haplotypes of the two SNP loci (SNP1=G/T and SNP2=C/T) are possible:

Haplotype I (G-C)      Haplotype III (T-C)
Haplotype II (G-T)     Haplotype IV (T-T)

Each chromosome will have its own haplotype for the two SNP loci; therefore,
each individual is expected to possess two haplotypes. Since the maternal and paternal
chromosomes cannot be distinguished, ten haplotype combinations between the
two chromosomes are possible and each individual is scored as possessing one of the
following haplotype combinations:

I,I       II,II      III,III    IV,IV
I,II      II,III     III,IV
I,III     II,IV
I,IV
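
The counts of ten haplotype combinations and nine genotype combinations can be
reproduced by enumeration. The following is a minimal sketch, taking SNP1 as the G/T
polymorphism and SNP2 as the C/T polymorphism described at the start of this Example;
the function and variable names are illustrative only.

```python
from itertools import combinations_with_replacement

# Haplotypes as (SNP1 allele, SNP2 allele).
HAPLOTYPES = {
    "I": ("G", "C"),
    "II": ("G", "T"),
    "III": ("T", "C"),
    "IV": ("T", "T"),
}


def haplotype_combinations():
    """Unordered pairs of haplotypes (maternal/paternal not distinguished)."""
    return list(combinations_with_replacement(HAPLOTYPES, 2))


def genotype_of(pair):
    """Return the unphased (SNP1, SNP2) genotype implied by a haplotype pair."""
    first, second = (HAPLOTYPES[name] for name in pair)
    snp1 = "/".join(sorted((first[0], second[0])))
    snp2 = "/".join(sorted((first[1], second[1])))
    return snp1, snp2


if __name__ == "__main__":
    for pair in haplotype_combinations():
        print(pair, "->", genotype_of(pair))
```

The enumeration yields ten pairs but only nine distinct genotypes, because the pairs
I,IV and II,III both give the double heterozygote G/T : C/T. This is the case in which
genotyping alone cannot assign the haplotype.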


2. Genotype analysis of a Single Locus (the SNP1 Locus of the D2 Receptor Gene) Using
Phase I Allele-Specific Oligonucleotide Hybridization on Immobilized SNP1
Allele-Specific Oligonucleotides.

The SNP-Haplotyping Method of the invention depends, in some aspects, on the
ability to discriminate between polymorphic loci using differential ASO hybridizations.
The technique of ASO hybridization has been established in the literature (Wang et
al., Science, 280:1077-1082 (1998); Guo, S., et al., Nucleic Acids Res., 22:5456-5465
(1994); Sapolsky, R., et al., Genet. Anal. Biomol. Engin., 14:187-192 (1999)). Phase I of
this method involves the accurate genotyping of multiple individuals, for the SNP1
locus, using ASO hybridization techniques. The following protocol is outlined in Figure
1, which outlines the Phase I Hybridization Protocol and Expected Results.

Step 1: Synthesis of anti-sense SNP1 allele-specific oligonucleotides
Step 2: Attachment of anti-sense SNP1 allele-specific oligos to wells
Step 3: Amplification of the SNP1/SNP2 polymorphic locus from individuals
        using a Cy3-labeled PCR reaction
Step 4: Hybridization of Cy3-labeled PCR products from each individual to
        duplicate SNP1-(G) allele wells and duplicate SNP1-(T) allele wells

Step 1 involves Synthesis of Oligonucleotides Representing the Antisense Strand
of the Two SNP1 Alleles. Two oligonucleotides are synthesized, each representing one
allele of the SNP1 (G/T) locus of the D2 receptor. The oligonucleotides represent the
antisense (complementary) strand for each allele as follows:

SNP1-(G detecting) oligo: NH2-(T)15AGTCTCCC(C)TTTCCCT (SEQ ID NO:3)
SNP1-(T detecting) oligo: NH2-(T)15AGTCTCCC(A)CTTTCCCT (SEQ ID NO:4)

The amino group is added to facilitate binding to the surface of the wells. The
addition of 15 Ts on the 5' end of the oligonucleotide functions as a "spacer" sequence.
Spacer sequences have been shown to greatly enhance the hybridization signal,
presumably by lifting the hybridization sequence off the support surface thereby
decreasing the steric interference produced by that surface (Guo, S., et al., Nucleic Acids
Res., 22:5456-5465 (1994)), but are not essential.

Step 2 involves Binding of Oligonucleotides to Solid Surface. Each
oligonucleotide is covalently attached to one 96-well Xenobind Black plate (Xenopore,
Corp., Hawthorne, NJ) as follows:

a) 200 pmol of each oligonucleotide is resuspended in 0.05 M phosphate buffer,
   pH 7.0 and placed into the wells of a Xenobind plate. One plate contains
   SNP1-(G detecting) oligos and the second plate contains SNP1-(T detecting)
   oligos.

b) Plates are incubated at 37° C for 2 hours.

c) Plates are washed once with 0.2% SDS and twice with ddH2O.

d) Unbound sites are blocked with blocking solution (1 g sodium borate in 400
   ml of 25% ethanol in PBS) for 5 minutes.

e) Wells are washed, as above, and air dried in the dark at room temperature.

Step 3 involves Amplification of the Polymorphic Locus from the Test Subjects. The
polymorphic locus (SNP1 and SNP2) from the D2 receptor gene is PCR-amplified from
each of the individual test subjects. The primers used are the primers outlined above
(SEQ ID NO:1 and 2), except that the M13 sequences are omitted. The PCR is carried
out in the presence of Cy3-dCTP, to fluorescently label the products, and the PCR
products are purified using a PCR purification column system (QIAGEN).

Step 4 involves Hybridization of the Polymorphic Locus PCR Products from the
Individual Test Subjects to the SNP1-(G detecting) oligo and SNP1-(T detecting) oligo
bound wells.

a) Each of the 48 fluorescently labeled PCR products is denatured by boiling
   and diluted to 0.5 pmol/ml in TMAC hybridization solution (3.0 M
   TMAC/0.6% SDS/10 mM sodium phosphate pH 6.5/5X Denhardt's
   solution/40 µg/ml yeast tRNA). TMAC (Sigma, Inc.) allows the
   hybridizations to progress independent of G/C content and intrinsic melting
   temperatures.

b) 200 µl each of the diluted PCR products is added in duplicate to two wells (#
   of individuals (i.e. 48 individuals) x 2 wells = 96 wells) of the SNP1-(G
   detecting) oligo plate and two wells of the SNP1-(T detecting) oligo plate.
   Hybridizations are incubated overnight at 52° C with gentle agitation.

c) Plates are washed twice for 30 minutes at room temperature and once for 20
   minutes at 54° C in wash buffer (3.0 M TMAC/0.6% SDS/10 mM sodium
   phosphate, pH 6.8).

d) Plates are read on an Ultra Reader fluorescent microplate reader (Tecan
   Instruments) to determine which wells have a positive hybridization signal.

Control wells are set up to monitor background produced by non-specific binding
to unattached sites of the wells, insufficient blocking, random DNA/DNA interactions,
etc. To control for and subtract out background signals from the ASO hybridizations, a
random segment of human DNA is amplified from genomic DNA. This random
segment is of equal length (350 base pairs) and of approximately equal G/C and A/T
content to the locus specific PCR products from the test individuals. These control DNA
segments are hybridized to wells bound with each of the SNP1 allele-specific oligos as
outlined above.

If the Cy3-labeled PCR product from an individual binds to the oligonucleotides
attached to the wells, then a fluorescent signal is detected in that well. If the PCR
product does not bind to the oligonucleotide attached to the well, then no fluorescent
signal is detected. The following genotypes are possible:

a) G/T heterozygote

b) G/G homozygote

c) T/T homozygote

The hybridization patterns expected for each genotype at the SNP1 (G/T) locus of
the D2 receptor gene (G/T heterozygote, G/G homozygote, T/T homozygote) are as
follows.

1. G/T Heterozygote: If an individual is a G/T heterozygote then the hybridization of
the Cy3-labeled PCR product occurs in wells containing both the SNP1-(G detecting)
oligo and the SNP1-(T detecting) oligo. As a result, a fluorescent signal is detected
for wells bound with oligonucleotides representing both SNP1 alleles.

2. G/G Homozygote: If an individual is a G/G homozygote then the hybridization of the
Cy3-labeled PCR product occurs only in the wells containing the SNP1-(G detecting)
oligo. As a result, fluorescent signals are detected only in wells bound with
oligonucleotides representing the SNP1-G allele.

3. T/T Homozygote: If an individual is a T/T homozygote then the hybridization of the
Cy3-labeled PCR product occurs only to the wells containing the SNP1-(T detecting)
oligo. As a result, a fluorescent signal is detected only in wells bound with
oligonucleotides representing the SNP1-T allele.

Control Wells: Negligible fluorescent signals are obtained with wells hybridized with
control PCR products. Any background signal is subtracted from the fluorescent signals
obtained with the test wells and these values are used as the adjusted results.
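
The scoring logic described above (duplicate wells, background subtraction, then
presence or absence of signal on each allele plate) can be sketched as follows. This is
an illustrative sketch only: the function name, the use of the mean of the duplicate
wells, and the positive-signal threshold are assumptions, since the specification does
not state how a "positive" signal is defined numerically.

```python
from statistics import mean


def call_snp1_genotype(g_wells, t_wells, g_background, t_background, threshold):
    """Call the SNP1 genotype from Cy3 readings on the G and T oligo plates.

    g_wells / t_wells: duplicate readings for one individual.
    g_background / t_background: readings from the control-DNA wells.
    threshold: hypothetical cutoff on the background-adjusted signal.
    Returns "G/T", "G/G", "T/T", or None when neither plate is positive.
    """
    g_signal = mean(g_wells) - g_background
    t_signal = mean(t_wells) - t_background
    g_positive = g_signal > threshold
    t_positive = t_signal > threshold
    if g_positive and t_positive:
        return "G/T"
    if g_positive:
        return "G/G"
    if t_positive:
        return "T/T"
    return None  # no call, e.g. failed PCR or failed hybridization


if __name__ == "__main__":
    # Made-up readings, for illustration only.
    print(call_snp1_genotype([900, 950], [880, 910], 50, 60, threshold=200))
```

Returning None for a sample with no positive plate keeps failed reactions from being
scored as a genotype.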

The conditions outlined herein, particularly the use of TMAC in the hybridization
reactions, have been shown to accurately discriminate single base mismatches in ASO
hybridizations. In order to minimize any potential indiscriminate binding in the reaction
the following optional steps can be performed.

1. Cold competitor (unlabeled oligos of the opposite allele) is added to the hybridization
reactions to compete out binding to the indiscriminate allele.

2. Enhanced discrimination of SNPs by artificial mismatch hybridization can also be
used (Guo, Z., et al., Nature Biotech., 15:331-335 (1997)). It has been shown that the
ability to discriminate between 1 vs. 2 mismatches is 200% greater than the
discrimination between 0 vs. 1 mismatch. That is, the difference in stability of
hybridized DNA segments with 2 mismatches versus 1 mismatch is significantly
greater than the stability difference between 0 mismatches and 1 mismatch.
Therefore, the addition of an artificial mismatch into the SNP1 allele-specific
oligonucleotides bound to the wells can be expected to abolish any non-specific
binding between an individual's SNP1 locus and the erroneous SNP1 allele-specific
oligonucleotide.

3. Haplotype analysis of Two Loci (the SNP1 and SNP2 Loci of the D2 Receptor Gene)
Using Phase I and II Allele-Specific Oligonucleotide Hybridization on Immobilized
SNP1 Allele-Specific Oligonucleotides.

Phase I of the Method of Haplotyping (outlined above) involves hybridizing
Cy3-labeled PCR products from the SNP1/SNP2 loci of each individual to SNP1
allele-specific oligonucleotides. Phase II involves an additional hybridization with SNP2
allele-specific oligonucleotides which simultaneously determines: 1) the genotype of the
SNP2 locus for each individual and 2) the phase of the SNP2 genotype with the SNP1
genotype, in other words, the haplotype.

SNP1 Antisense Allele-specific Oligonucleotides are bound to the Wells of a
Xenobind Black 96-Well Plate. SNP1-(G)-detecting and SNP1-(T)-detecting antisense
allele-specific oligonucleotides (SEQ ID NO:3 and 4) are synthesized. Each SNP1 oligo
is bound to two 96 well Xenobind Black plates (4 plates total) as outlined above.

PCR Amplification and Phase I Hybridization of the SNP1/SNP2 Locus from the
Individual Test Subjects to the SNP1-(G detecting) Oligo and SNP1-(T detecting) Oligo
Bound Wells is performed. The SNP1/SNP2 locus is PCR amplified from the test
subjects in the presence of Cy3-dCTP and hybridized to two wells each of the 4 plates
containing immobilized SNP1 (G) or (T) allele-specific oligos. However, after the last
wash at 54° C, the plates are not read on a fluorometer. They are, instead, subjected to
the next step in the SNP-Haplotyping Protocol.

Synthesis of Oligonucleotides Representing the Antisense Sequence of the Two
SNP2 Alleles is performed next, followed by Phase II Hybridization to the Immobilized
SNP1 allele-specific oligo:SNP1/SNP2 locus complex Bound to the Xenobind Plates.

Oligonucleotides are synthesized, each representing one allele of the SNP2 (C/T)
locus, in the presence of Cy5-dCTP. The oligonucleotides represent the antisense
sequence for each SNP2 allele as follows:

SNP2-(C detecting) oligo: AGGGTGGT(G)GCAGAGGT (SEQ ID NO:5)
SNP2-(T detecting) oligo: AGGGTGGT(A)CCAGAGGT (SEQ ID NO:6)

As outlined in Figures 1 and 2, the SNP2-(C)-oligo and SNP2-(T)-oligo are each
hybridized to one Xenobind plate bound with the SNP1 (G) oligo:individual PCR
product complex (Fig. 2, plates 2 & 4). Each SNP2 allele-specific oligonucleotide is
diluted to 0.5 pmol/ml in TMAC hybridization solution + 50X cold competitor. The
hybridization protocol is identical to the protocol described above (Step 4). Plates are
read on a fluorescent microplate reader which can differentiate Cy3 and Cy5 signals.
Cy5 signals are read to determine haplotype.

Phase II Hybridization Setup is shown in Figure 2. Phase II hybridization setup
is represented in Plates A-D. Plate A) SNP1 (G) allele-specific oligo is bound to the
plate and Phase I-hybridized with 48 test genomes, each in duplicate wells. Plate A is
then Phase II-hybridized with SNP2 (C) allele-specific oligo. Plate B) SNP1 (G)
allele-specific oligo is bound to the plate and Phase I-hybridized with 48 test genomes,
each in duplicate wells. Plate B is then Phase II-hybridized with SNP2 (T) allele-specific
oligo. Plate C) SNP1 (T) allele-specific oligo is bound to the plate and Phase I-hybridized
with 48 test genomes, each in duplicate wells. Plate C is then Phase II-hybridized with
SNP2 (C) allele-specific oligo. Plate D) SNP1 (T) allele-specific oligo is bound to the
plate and Phase I-hybridized with 48 test genomes, each in duplicate wells. Plate D is
then Phase II-hybridized with SNP2 (T) allele-specific oligo.

Four haplotypes of the two SNP loci (SNP1=G/T and SNP2=C/T) are possible:

Haplotype I (G-C)
Haplotype II (G-T)
Haplotype III (T-C)
Haplotype IV (T-T)

A schematic diagram depicting the determination of haplotype is shown in Figure
3. In Figure 3, the expected Phase II hybridization patterns are shown for the four
possible haplotypes (rows 1-4) of the SNP1 (G/T) locus and the SNP2 (C/T) locus.
Columns A-D refer to Plates A-D as outlined in Figure 2. The (G-C) haplotype gives a
positive signal on Plate A (well 1A). The (G-T) haplotype gives a positive signal on
Plate B (well 2B). The (T-C) haplotype gives a signal on Plate C (well 3C). The (T-T)
haplotype gives a signal on Plate D (well 4D).

SNP1 Genotype: If an individual possesses a G allele at the SNP1 locus, then
that individual's PCR product will hybridize only to those wells containing the
SNP1-(G)-detecting-oligo (Figure 3, wells 1A, 1B, 2A and 2B). If this individual, however,
possesses a T allele at the SNP1 locus then hybridization will occur to those wells
containing the SNP1-(T)-detecting-oligo (Figure 3, wells 3C, 3D, 4C and 4D). Since the
PCR products are Cy3 labeled, the genotype at SNP1 can be determined by detecting
which wells have a Cy3 signal.

Haplotyping for One Chromosome: The haplotype for a particular chromosome is
determined by hybridizing with either the SNP2-(C) or the SNP2-(T) allele-specific
oligo. If an individual has a SNP1-SNP2 haplotype of G-C on one chromosome, then a
Cy5 signal results when the SNP2 (C) specific oligo binds to a (G-C) PCR product/SNP1
(G) specific oligo complex bound to a well (Figure 3, well 1A). If the individual,
however, possesses a SNP1-SNP2 haplotype of G-T, then a Cy5 signal results when the
SNP2 (T) specific oligo binds to a (G-T) PCR product/SNP1 (G) specific oligo complex
(Fig. 3, well 2B).

Haplotype for Both Chromosomes: By examining the hybridization patterns (i.e. which
plates have wells with a Cy5 signal), the haplotype can be determined for both
chromosomes. As outlined in Figure 2: Plate A will detect SNP1 (G)/SNP2 (C); Plate B
will detect SNP1 (G)/SNP2 (T); Plate C will detect SNP1 (T)/SNP2 (C) and Plate D will
detect SNP1 (T)/SNP2 (T). Therefore, haplotypes can be scored for hybridization
signals seen on each plate. They are as follows:

Plate     SNP1-SNP2     Haplotype

A         G-C           I
B         G-T           II
C         T-C           III
D         T-T           IV



Since each individual possesses two chromosomes, two haplotypes are expected to be
detected per person. Since the maternal and paternal chromosomes cannot be
distinguished, ten haplotype combinations are possible. Their expected hybridization
patterns are outlined in Table 1.

TABLE 1

Hybridization Patterns Expected For All 10
Possible Haplotype Combinations

Haplotype       Plates Where Signal Is Detected
                For Each Individual

I, I            A
I, II           A, B
I, III          A, C
I, IV           A, D
II, II          B
II, III         B, C
II, IV          B, D
III, III        C
III, IV         C, D
IV, IV          D


The ten possible haplotype combinations for both chromosomes are indicated in
the left column of the chart. The plate (A-D) where the signal is expected to be detected
for each haplotype combination is indicated in the right column of the chart. It is
expected that the haplotypes generated for the shuffled test samples will match the
known haplotypes generated by sequencing.
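
The scoring rule of Table 1 can be expressed as a small lookup from the set of plates
showing a Cy5 signal to the haplotype combination. The sketch below is illustrative
only; the function name and the choice to reject patterns with zero or more than two
positive plates are assumptions.

```python
# Plate -> haplotype, as listed in the plate/haplotype chart above.
PLATE_TO_HAPLOTYPE = {"A": "I", "B": "II", "C": "III", "D": "IV"}


def score_haplotype_pair(positive_plates):
    """Map the plates with a Cy5 signal to a haplotype combination.

    One positive plate means the individual is homozygous for that
    haplotype; two positive plates give the two haplotypes present.
    Any other pattern is rejected as uninterpretable for one individual.
    """
    plates = sorted(set(positive_plates))
    if any(plate not in PLATE_TO_HAPLOTYPE for plate in plates):
        raise ValueError(f"unknown plate in {plates}")
    if not 1 <= len(plates) <= 2:
        raise ValueError("expected a signal on one or two plates")
    haplotypes = [PLATE_TO_HAPLOTYPE[plate] for plate in plates]
    if len(haplotypes) == 1:
        haplotypes = haplotypes * 2  # homozygous for the haplotype
    return tuple(haplotypes)


if __name__ == "__main__":
    for pattern in (["A"], ["A", "D"], ["B", "C"]):
        print(pattern, "->", score_haplotype_pair(pattern))
```

Note that the patterns A,D and B,C are scored as different haplotype combinations even
though both correspond to the same unphased genotype.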

Example 2: Haplotyping Procedure
Introduction

The following is the protocol used to detect genetic haplotypes consisting of two
Single Nucleotide Polymorphisms (SNPs) in the human Beta-Adrenergic Receptor
(BAR) gene. The assay was conducted in 96 well plate format, in which a set of four
wells was used to detect each sample's haplotype. In order to distinguish the allele
present for the first SNP detected, an amine labeled oligo was bound to the surface of the
wells. One pair of wells contained an oligo, BAR-G, complementary to a 17 base pair
region including one allele of the SNP, while the remaining two wells contained an
oligo, BAR-A, complementary to a 17 base pair region including the other allele of the
first SNP. This creates a situation where the Beta-Adrenergic Receptor gene probe binds
preferentially to one set of wells over the other if only one allele of that SNP is present.
If both alleles are present in the sample's genome, the probe, produced using the
Polymerase Chain Reaction (PCR), will bind to both sets of wells with proportionately
equal success.

Following binding of the amine labeled oligo, the wells of the assay plate are
washed to remove unbound oligo. The plate is then treated with blocking agents to
prevent nonspecific binding of the subsequent hybridization and detection components to
the wells. Further washing removes the blocking agent and prepares the assay plate for
hybridization.

In the hybridization, the sample of interest's BAR PCR product probe is introduced into
each of the four wells used in the haplotype detection. In order to detect the genotype of
the second SNP, a pair of biotin-labeled oligos, BAR-PIIG and BAR-PIIA,
complementary to the 17 base pair region including the two SNP alleles of the second
SNP, are added as follows:



          Amine Oligo     Biotin Oligo     Positive Haplotype Detected

Well 1    BAR-G           BAR-PIIG         G-G
Well 2    BAR-G           BAR-PIIA         G-A
Well 3    BAR-A           BAR-PIIG         A-G
Well 4    BAR-A           BAR-PIIA         A-A

Also added to the hybridization is either member of a pair of unlabeled cold
competitor oligos. The cold competitor oligo DNA sequence is identical to the
biotin-labeled oligo except that it does not contain a biotin label. The cold competitor
oligo is added to the opposite wells from those shown above and in some cases helps to
enhance SNP discrimination. The addition of the cold competitor oligos is shown below.



          Cold Competitor Oligo

Well 1    BARPIIAcc
Well 2    BARPIIGcc
Well 3    BARPIIAcc
Well 4    BARPIIGcc

Hybridization occurs by incubating the components together in the assay plate
wells overnight. Probe that is non-complementary to the amine oligo is washed from the
wells, consequently removing any biotin oligo from the wells as well. Biotin oligo is
also removed from wells in which the probe binds the amine oligo at the first SNP
location but does not contain the allele of the second SNP that is complementary to the
SNP present in the biotin oligo sequence. Thus, biotin-labeled oligo remains after
post-hybridization washing only in wells in which the probe is complementary to the SNPs
contained in the amine-labeled oligo and the biotin labeled oligo added to that well. This
well will produce a positive signal for the haplotype it detects. Wells that do not meet
this criterion will produce no signal, a "negative" signal.

The haplotypes present, of which there can be at most two of the four possible
demonstrated haplotypes (one if the person has a homozygous haplotype), are detected
by addition of a streptavidin-horseradish peroxidase conjugate to the assay plate.
Biotin-streptavidin binding ensures that the peroxidase remains in the wells containing the
positive haplotypes. Detection takes advantage of this with the addition of peroxidase
substrates that form a chemiluminescent product in the positive wells and essentially none
in the negative wells.
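
The four-well layout described above can be written down as a lookup table, which makes
the relationship between the amine oligo, biotin oligo, cold competitor and detected
haplotype explicit. The following is a sketch for illustration; the data structure and
function name are assumptions, while the oligo names and well assignments are those
given in the tables above.

```python
# Well number -> reagents added and the haplotype a positive signal indicates.
WELL_LAYOUT = {
    1: {"amine": "BAR-G", "biotin": "BAR-PIIG", "competitor": "BARPIIAcc", "haplotype": "G-G"},
    2: {"amine": "BAR-G", "biotin": "BAR-PIIA", "competitor": "BARPIIGcc", "haplotype": "G-A"},
    3: {"amine": "BAR-A", "biotin": "BAR-PIIG", "competitor": "BARPIIAcc", "haplotype": "A-G"},
    4: {"amine": "BAR-A", "biotin": "BAR-PIIA", "competitor": "BARPIIGcc", "haplotype": "A-A"},
}


def haplotypes_detected(positive_wells):
    """Return the haplotypes indicated by the chemiluminescent wells.

    At most two of the four wells can be positive for one sample
    (one well if the sample is homozygous for a haplotype).
    """
    wells = sorted(set(positive_wells))
    if any(well not in WELL_LAYOUT for well in wells):
        raise ValueError(f"unknown well in {wells}")
    if not 1 <= len(wells) <= 2:
        raise ValueError("expected one or two positive wells per sample")
    return [WELL_LAYOUT[well]["haplotype"] for well in wells]


if __name__ == "__main__":
    print(haplotypes_detected([1]))
    print(haplotypes_detected([2, 3]))
```

In each well the cold competitor carries the allele opposite to that of the biotin oligo,
as the two tables above indicate.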

HAPLOTYPING PROTOCOL

Procedure

PCR Product Preparation for Use as Haplotyping Hybridization Probe

PCR Preparation

The PCR reactions were prepared as follows:

1. ABgene PCR Master Mix (Marsh Biomedical Products, Inc., Rochester, N.Y., cat.
#AB-0575)
Final conc. 1X

2. Forward primer 5' [PO4]-ACTTGACAGCGAGTGTGCTG 3' (SEQ ID NO:7)

3. Reverse primer 5' GTCCCTTTGCTGCGTGAC 3' (SEQ ID NO:8)
Final primer concentration 0.1 µM for each primer

4. Genomic DNA template
Final conc. 1 ng/µl

Total reaction volume = 50 µl

The PCR reaction was conducted in a PTC-225 DNA Engine Tetrad (MJ Research,
Waltham, MA) using the following PCR profile:

1. 5 minutes at 94°C
2. 1 minute at 96°C
3. 1 minute at 56°C
4. 1:30 minutes at 72°C
5. Repeat from step 2, 34 times
6. 10 minutes at 72°C
7. 4°C constant hold

The final product of this PCR reaction was a 1140 base pair fragment of the
Beta-Adrenergic Receptor (BAR) gene sequence.
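
As a planning aid, the programmed hold time of the profile above can be added up. The
sketch below assumes that "repeat from step 2, 34 times" means one initial pass plus 34
repeats (35 cycles in total) and ignores ramp times between temperatures; both points
are assumptions, and the function name is illustrative.

```python
def pcr_hold_minutes(initial_denaturation=5.0,
                     cycle_steps=(1.0, 1.0, 1.5),
                     repeats=34,
                     final_extension=10.0):
    """Total programmed hold time in minutes, excluding ramping and the 4 C hold.

    cycle_steps holds the denaturation, annealing and extension times of
    one cycle; the cycle runs once and is then repeated `repeats` times.
    """
    cycles = 1 + repeats
    return initial_denaturation + cycles * sum(cycle_steps) + final_extension


if __name__ == "__main__":
    print(f"{pcr_hold_minutes():.1f} minutes of programmed hold time")
```

Under these assumptions the profile holds for 5 + 35 x 3.5 + 10 = 137.5 minutes before
the final 4°C hold.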

PCR Product Preparation Protocol
A. Purification of PCR products

Each 50 µl PCR reaction was transferred to a MultiScreen-PCR Plate (Millipore,
Bedford, MA, cat. # MANU 030 50). The MultiScreen plate was placed on the vacuum
manifold and the vacuum was engaged for ten minutes. Then 50 µl of H2O was added to
each well of the MultiScreen plate and the plate was shaken at 98 rpm at room
temperature for five minutes. The DNA was eluted from the MultiScreen plate as
described below in step B-3.

B. Exonuclease Digestion

Following the purification, Lambda Exonuclease digestion was performed to
produce single-stranded DNA. To do this, Gibco Lambda exonuclease, 6 U/µl (Life
Technologies, Rockville, MD, cat. #28023-018), was added to 2X Lambda exonuclease
buffer (10 mg/ml glycine, 5 mM MgCl2, pH 9.4) at a ratio of 1:50. Then 50 µl of the
enzyme/buffer mix from above was dispensed per well into a Skirted Thermo-Fast 96
96-well PCR plate (Marsh cat. # AB-0800). The Millipore MultiScreen-PCR Plate purified
PCR products were eluted and transferred to the PCR plate containing the 2X lambda
exonuclease enzyme/buffer mixture. The PCR products were then digested for thirty
minutes at 37°C in a PTC-225 DNA Engine Tetrad (MJ Research), and the reaction was
then heated for 10 min. at 75°C to stop the reaction.

C. Purification of Lambda Exonuclease Digested PCR Products

The 100 µl lambda exonuclease digest was transferred to a Millipore MultiScreen
plate and purified as described above. Then the ssDNA was eluted in 50 µl H2O as
described above.

D. Quantification of Purified Lambda Exonuclease Digested PCR Product

50 µl of the product from step C-3 above was placed into a COSTAR 96-well
flat-bottom UV plate (Corning, Inc., Acton, MA, cat. #3635). Then the sample DNA
concentrations were determined by reading the A260 of the samples in the TECAN
Spectrafluor Plus (Durham, N.C.) compared to a standard curve of known ssDNA
concentrations of equal volumes.
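Reading a concentration off a standard curve amounts to fitting a line through the standards and inverting it for each sample. The sketch below shows one way to do this with an ordinary least-squares fit. The standard concentrations and A260 values are made-up numbers chosen to be exactly linear; they are not data from this specification, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_a260(a260, slope, intercept):
    """Invert the standard curve to estimate a sample concentration."""
    return (a260 - intercept) / slope


# HYPOTHETICAL ssDNA standards (ng/ul) and their A260 readings at equal volume.
standards_ng_per_ul = [0.0, 5.0, 10.0, 20.0, 40.0]
standards_a260 = [0.05, 0.15, 0.25, 0.45, 0.85]

slope, intercept = fit_line(standards_ng_per_ul, standards_a260)
sample_conc = concentration_from_a260(0.35, slope, intercept)
print(f"slope={slope:.4f} A260 per ng/ul, intercept={intercept:.4f}, sample={sample_conc:.1f} ng/ul")
```

Samples whose A260 falls outside the range spanned by the standards would be extrapolated by this fit and should be treated with caution.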

Haplotyping Hybridization Protocol 

A. Bind Amine Labeled Oligo to Well of Assay Plate

Amine labeled oligo

BAR-G

5' NH2-(T)23CACCCAATGGAAGCCAT 3' (SEQ ID NO:9)

BAR-A

5' NH2-(T)23CACCCAATAGAAGCCAT 3' (SEQ ID NO:10)
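The `(T)23` shorthand in the two capture oligos denotes a run of 23 thymines used as a spacer. The sketch below expands that shorthand, confirms the 40-base length given for SEQ ID NO:9 and NO:10 in the sequence listing, and locates the single base that distinguishes BAR-G from BAR-A. The helper name `expand` is hypothetical; the sequences are copied from the lines above.

```python
import re


def expand(notation):
    """Expand homopolymer shorthand such as '(T)23' into the full sequence."""
    return re.sub(
        r"\(([ACGT])\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        notation,
    )


bar_g = expand("(T)23CACCCAATGGAAGCCAT")  # SEQ ID NO:9 without the 5' amine
bar_a = expand("(T)23CACCCAATAGAAGCCAT")  # SEQ ID NO:10 without the 5' amine

# 1-based positions at which the two capture oligos differ.
mismatches = [
    pos for pos, (g, a) in enumerate(zip(bar_g, bar_a), start=1) if g != a
]
print(len(bar_g), len(bar_a), mismatches)
```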

150 pmol of amine labeled oligo was bound into the well of a COSTAR DNA-BIND
plate (Corning, Inc. cat. #2498) in oligo binding buffer (100 µl of disodium
phosphate, 50 mM, and EDTA, 1 mM, pH 8.5). Wells of the assay plate used for
negative "no oligo" controls received 100 µl of binding buffer without amine-labeled
oligo added. The oligo was allowed to bind by incubating for two hours at 37°C while
shaking at 98 rpm.

B. Removal of Unbound Oligo from the Assay Plate

Oligo binding buffer was removed by inverting the assay plate and dumping the
contents of the wells into the sink. The plate was then gently shaken to remove any
residual droplets of binding buffer. Any remaining amine oligo was removed by washing the wells
of the assay plate three times with 200 µl of 1X PBS wash buffer. Washes were added
via a 12-channel pipettor and removed by the technique described above. One wash
consisted of addition of the 200 µl wash buffer to the dry assay plate and its removal. All
subsequent washes were carried out in an identical manner.

C. Block Assay Plate to Prevent Nonspecific Binding

After removal of the third wash, the wells of the assay plate were blocked with
200 µl blocking buffer (disodium phosphate, 50 mM, and EDTA, 1 mM, pH 8.5, 3%
Bovine Serum Albumin, 0.05% Tween 20). The plate was allowed to block by
incubation for 1 hour at 37°C while shaking at 98 rpm.

D. Prehybridization Wash

The plate was washed three times with 1X PBS, 200 µl per wash, as described
above. Then the plate was washed once with 200 µl of TMAC-B solution (3 M
tetramethylammonium chloride (Sigma-Aldrich Inc., St. Louis, MO, product # T 3411),
50 mM Tris pH 8.0, 0.1% SDS, 1 mM EDTA) and the wash was allowed to incubate for 7
minutes at room temperature.

E. Hybridization Solution Addition

Hybridization solution was prepared in a 96-well PCR plate using the following
procedure.

Contents:

1. 0.4 pmol single-stranded BAR gene DNA probe

2. 100.0 pmol cold competitor to bind opposite allele of SNP2 on BAR PCR product

BARPIIGcc 5' AGGAAATCGGCAGCTGT 3' (SEQ ID NO:11)
BARPIIAcc 5' AGGAAATCAGCAGCTGT 3' (SEQ ID NO:12)

3. 100.0 pmol biotinylated SNP2 detection oligo

BAR-PIIG 5' [Bio]-AGGAAATCGGCAGCTGT 3' (SEQ ID NO:13)
BAR-PIIA 5' [Bio]-AGGAAATCAGCAGCTGT 3' (SEQ ID NO:14)

4. The final volume was brought up to 100 µl such that hybridization occurs in TMAC-B
solution as described above.

The Hybridization Solution was heated for 7 minutes at 95°C in a PTC-100
Peltier Thermal Cycler from MJ Research. Then 100 µl of hybridization solution was
transferred to the COSTAR DNA-BIND plate and hybridized overnight at 51°C with
shaking at 98 rpm.

F. Post-Hybridization Washes

The plate was washed three times with TMAC-B solution at room temperature,
with incubation in the third wash for 5 minutes at room temperature with shaking at 98
rpm. The plate was then washed once with TMAC-B solution at 52°C, with the wash
incubated for five minutes at 52°C with shaking at 98 rpm. The plate was then washed
twice with 2X SSC followed by one wash with oligo binding buffer.

G. Perform Detection

100 µl of blocking buffer containing 1:500 Peroxidase-labeled Streptavidin
(Kirkegaard and Perry Laboratories, cat. # 474-3000) was added to each well of the
assay plate. The plate was incubated 30 minutes at 37°C with shaking at 98 rpm.

H. Remove Unbound Detection Molecules

The assay plate was washed three times with 1X PBS + 0.05% Tween 20 wash
buffer followed by washing twice with 1X PBS wash buffer.

I. Detect Haplotype

100 µl of 1:1 SuperSignal ELISA Femto Luminol/Enhancer Solution and
SuperSignal ELISA Femto Stable Peroxide Solution (Pierce Chemical Company,
Rockford, IL, product #37075) was added and the assay plate was shaken for 1 minute at 98 rpm
at room temperature. The assay plate chemiluminescence was then read using a Tecan
Spectrafluor Plus in chemiluminescence mode, with the gain set to 130.
Results

The graph in Figure 5 represents data generated from the haplotyping of 4
individuals. The signal generated from a negative control well (no PCR product added)
was subtracted from the signal generated for each of the four wells analyzed for each
individual. The background-subtracted signal was plotted for each well. From this
analysis, the determined haplotypes for each individual are as follows: #1-homozygote
A-G, #2-homozygote A-G, #3-heterozygote G-G, A-G, #4-homozygote A-A. Sequence
analysis of several subcloned BAR products for each individual has confirmed these
haplotypes.
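The scoring described above (subtract the negative-control signal, then read off which of the four wells are positive) can be sketched as follows. This is only an illustration under stated assumptions: the well readings and the positivity threshold are invented numbers, the specification does not state a numeric cutoff, and the function name `call_haplotypes` is hypothetical.

```python
def call_haplotypes(well_signals, negative_control, threshold):
    """Return (zygosity, positive haplotypes) from raw well signals.

    well_signals maps a haplotype label to the raw signal of its well.
    A well counts as positive when its background-subtracted signal
    exceeds the (assumed) threshold.
    """
    corrected = {hap: raw - negative_control for hap, raw in well_signals.items()}
    positives = sorted(hap for hap, value in corrected.items() if value > threshold)
    if not 1 <= len(positives) <= 2:
        raise ValueError(f"expected one or two positive wells, found {len(positives)}")
    zygosity = "homozygote" if len(positives) == 1 else "heterozygote"
    return zygosity, positives


# HYPOTHETICAL raw chemiluminescence readings for one individual.
signals = {"G-G": 5200.0, "G-A": 310.0, "A-G": 4800.0, "A-A": 290.0}
zygosity, haplotypes = call_haplotypes(signals, negative_control=250.0, threshold=1000.0)
print(zygosity, haplotypes)
```

Raising an error when zero or more than two wells are positive reflects the constraint that a diploid sample carries at most two haplotypes.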

Example 3. Haplotyping Assay Using Asymmetric PCR Products

Introduction 

The hybridization described in Example 2 can also be performed in two steps.
The PCR product and an appropriate cold competitor are added to wells containing
bound amine-labeled oligo. These components are allowed to hybridize, then washed
thoroughly to remove nonspecific binding. The second set of hybridization components
is then added to the wells and allowed to hybridize. In this protocol, detection was
carried out using a streptavidin-alkaline phosphatase conjugate. Fluorescent products
were then formed upon the addition of alkaline phosphatase substrates.

An alternate method of preparing the PCR product probe utilizes a method
known as asymmetric PCR. In this method, the strand of interest is produced at a much
higher frequency than its complement during the PCR reaction. This is accomplished by
increasing the concentration of the primer that initiates replication of the desired strand
relative to the concentration of the primer producing the strand's complement. This
results in a large number of copies of the strand of interest being produced that have no
complement with which to bind. These single-stranded DNA fragments are readily
available to take part in the hybridization that follows. Therefore, digestion of the
opposite strand via exonuclease is not necessary when the PCR product is produced with
asymmetric primer concentrations.
The locus of interest in this procedure can be found using accession # G54849 to
search the website of the National Center for Biotechnology Information's (NCBI)
GenBank Database.



Procedure 

A. Hybridization Components

Well   Haplotype Detected   Amine oligo bound
1.     G-T                  4035AmG
2.     G-C                  4035AmG
3.     C-T                  4035AmC
4.     C-C                  4035AmC

Phase I Components (An equal amount of an asymmetric PCR reaction was added to
each well in this hybridization)

Well   Cold Competitor
1.     4035ColdCompG
2.     4035ColdCompG
3.     4035ColdCompC
4.     4035ColdCompC

Phase II Components

Well   Cold Competitor   Biotinylated SNP Detection Oligo
1.     4035CCC           4035-TB
2.     4035TCC           4035-CB
3.     4035CCC           4035-TB
4.     4035TCC           4035-CB
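The three tables above describe one four-well layout, and it can help to hold them in a single structure and check that the rows agree with each other. The sketch below does that. The consistency rules it checks (capture oligo named after the first allele of the haplotype, detection oligo named after the second, Phase II cold competitor named after the other second-SNP allele) are inferred here from the oligo names and are an assumption, not a rule stated in this specification. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Well:
    number: int
    haplotype: str          # "<SNP1 allele>-<SNP2 allele>"
    amine_oligo: str        # capture oligo bound to the well
    phase1_competitor: str  # Phase I cold competitor
    phase2_competitor: str  # Phase II cold competitor
    detection_oligo: str    # biotinylated SNP detection oligo


WELLS = [
    Well(1, "G-T", "4035AmG", "4035ColdCompG", "4035CCC", "4035-TB"),
    Well(2, "G-C", "4035AmG", "4035ColdCompG", "4035TCC", "4035-CB"),
    Well(3, "C-T", "4035AmC", "4035ColdCompC", "4035CCC", "4035-TB"),
    Well(4, "C-C", "4035AmC", "4035ColdCompC", "4035TCC", "4035-CB"),
]


def layout_is_consistent(wells):
    """Check the ASSUMED naming rules linking haplotype and oligo names."""
    for well in wells:
        snp1, snp2 = well.haplotype.split("-")
        other_snp2 = "C" if snp2 == "T" else "T"
        if not well.amine_oligo.endswith(snp1):
            return False
        if well.detection_oligo != f"4035-{snp2}B":
            return False
        if well.phase2_competitor != f"4035{other_snp2}CC":
            return False
    return True


print(layout_is_consistent(WELLS))
```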



Protocol for Haplotyping with Asymmetric PCR Product

PCR Preparation

PCR reactions were performed as follows:

1. ABgene PCR Master Mix (Marsh Biomedical Products, Inc., cat. #AB-0575)
   Final conc. 1X

2. Forward primer (5' GAACAGCAATGCACATTACCATGG 3') (SEQ ID NO:15)
   Final primer concentration 0.2 µM

3. Reverse primer (5' CTGTCAAGTATTTCTCCGCAGCATA 3') (SEQ ID NO:16)
   Final primer concentration 1.0 µM

4. Genomic DNA template
   Final conc. 1 ng/µl

Total reaction volume = 50 µl

The PCR reaction was conducted in a PTC-225 DNA Engine Tetrad (MJ Research)
using the following PCR profile:

1. 5 minutes at 94°C
2. 1 minute at 94°C
3. 1 minute at 56°C
4. 1 minute at 72°C
5. Repeat from step 2, 34 times
6. 10 minutes at 72°C
7. 4°C constant hold

The final product of this PCR reaction was a 289 base pair fragment of the human
genome described above. This fragment contains SNPs at base pairs 35 (G/C) and 234
(C/T) of the PCR product. These two SNPs were the focus of this study.



Hybridization Protocol

A. Amine-labeled Oligo was bound to well of assay plate:

4035AmG
5' NH2-(T)23GCCACAATGAATGACAT (SEQ ID NO:17)
or 4035AmC
5' NH2-(T)23GCCACAATCAATGACAT (SEQ ID NO:18)

150 pmol of amine labeled oligo was bound into the well of a COSTAR DNA-BIND
assay plate (Corning, Inc. cat. #2498) in oligo binding buffer (100 µl of disodium
phosphate, 50 mM, and EDTA, 1 mM, pH 8.5). Wells of the assay plate used for
negative "no oligo" controls received 100 µl of binding buffer without amine-labeled oligo
added. The oligo was allowed to bind by incubating overnight at 4°C.

B. Removal of Unbound Oligo from Assay Plate

Oligo binding buffer was removed with a 12-channel vacuum apparatus. The
remaining amine oligo was removed by washing the wells of the assay plate three times
with 200 µl of 1X PBS wash buffer. Washes were added via a 12-channel pipettor and
removed by the technique described above. One wash consisted of addition of the 200 µl
wash buffer to the dry assay plate and its removal. All subsequent washes were carried out
in an identical manner.

C. Blocking Assay Plate to Prevent Nonspecific Binding

After removal of the third wash, the wells of the assay plate were blocked with
200 µl blocking buffer (disodium phosphate, 50 mM, and EDTA, 1 mM, pH 8.5, 3%
Bovine Serum Albumin). The plate was allowed to block by incubation for 1 hour at
37°C with shaking at 98 rpm.

D. Prehybridization Wash

The wells were washed once with 1X PBS, 200 µl per wash, as described above.
The wells were then washed once with 200 µl of TMAC-B solution (3 M
tetramethylammonium chloride (Sigma-Aldrich Inc. product # T 3411), 50 mM Tris pH 8.0, 0.1%
SDS, 1 mM EDTA). The final wash was allowed to incubate for 7 minutes at room
temperature while treating the hybridization solution.

E. Perform Hybridizations

Phase I Hybridization

Hybridization solution was prepared in a 96-well PCR plate using the following procedure:
Contents:

1) 5-10 µl of 4035 locus asymmetric PCR DNA probe, with estimated concentration of
80 ng/µl.

2) 15 pmol cold competitor to inhibit binding of PCR product with SNP allele not being
detected:

4035ColdCompG 5' ATGTCATTGATTGTGGC 3' (SEQ ID NO:19)
or 4035ColdCompC 5' ATGTCATTCATTGTGGC 3' (SEQ ID NO:20)
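When working with capture oligos and cold competitors it is easy to lose track of which strand each one represents, so a reverse-complement check on the printed sequences can be a useful sanity test. The sketch below compares the 17-base allele-specific portions of the amine oligos from step A (SEQ ID NO:17 and NO:18, without the T spacer) with the two cold competitors listed above. It reports only how the sequences as printed relate to one another and makes no claim about how the oligos were designed. The function name is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Allele-specific 17-mers of the amine capture oligos (T spacer omitted).
capture_seq17 = "GCCACAATGAATGACAT"  # SEQ ID NO:17
capture_seq18 = "GCCACAATCAATGACAT"  # SEQ ID NO:18

# Phase I cold competitors as printed.
cold_comp_g = "ATGTCATTGATTGTGGC"    # SEQ ID NO:19
cold_comp_c = "ATGTCATTCATTGTGGC"    # SEQ ID NO:20

print(reverse_complement(capture_seq17) == cold_comp_c)
print(reverse_complement(capture_seq18) == cold_comp_g)
```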

3) The final volume was then brought up to 100 µl such that hybridization occurs in
TMAC-B solution as described above.

The Hybridization Solution was heated for 7 minutes at 95°C. Then 100 µl of
hybridization solution was transferred to the COSTAR DNA-BIND plate and allowed to
hybridize overnight at 52°C with shaking at 98 rpm.

Post-Hybridization Washes for Phase I Hybridization

Following the hybridization, the plate was washed three times with TMAC-B
solution at room temperature. The plate was then incubated 5 minutes at room
temperature with shaking at 98 rpm. The plate was then washed once with TMAC-B
solution at 52°C, with an incubation for five minutes at 52°C with shaking at 98 rpm.

Phase II Hybridization

Contents:

1) 45 pmol biotinylated Phase II detection oligo to bind 2nd SNP:

4035-CB 5' Biotin-TGTATAATCAGAATTAT 3' (SEQ ID NO:21)
or 4035-TB 5' Biotin-TGTATAATTAGAATTAT 3' (SEQ ID NO:22)

2) 90 pmol Phase II cold competitor to inhibit binding of biotinylated oligo to 2nd SNP
loci containing allele not being detected:

4035-C 5' TGTATAATCAGAATTAT 3' (SEQ ID NO:23)
or 4035-T 5' TGTATAATTAGAATTAT 3' (SEQ ID NO:24)

3) The final volume is then brought up to 100 µl such that hybridization occurs in
TMAC-B solution as described above.

100 µl of hybridization solution was transferred to the COSTAR DNA-BIND plate
and allowed to hybridize overnight at 52°C, with shaking at 98 rpm.

F. Post-Hybridization Washes for Phase II Hybridization

The plate was washed three times with TMAC-B solution at room temperature
and then incubated 5 minutes at room temperature with shaking at 98 rpm. The plate
was then washed once with TMAC-B solution at 52°C and incubated five minutes at 52°C
with shaking at 98 rpm. Then the plate was washed twice with 2X SSC, with the washes
carried out as described above. Then the plate was washed once with oligo binding
buffer.

Perform Detection

200 µl of 4°C 1X Elf Wash (ELF 97 mRNA In Situ Hybridization Kit, Component
A, Molecular Probes, Eugene, OR, Cat. # E-6605) was added to the plate, the plate was
incubated 5 minutes at room temperature, and the wash removed. 200 µl of Elf Blocking
Buffer (ELF 97 mRNA In Situ Hybridization Kit, Component B) was then added, the
plate incubated 45 minutes at room temperature, and the block removed. Then, 100 µl
Elf Streptavidin Alkaline Phosphatase Conjugate (ELF 97 mRNA In Situ Hybridization
Kit, Component J) diluted 1:50 in Elf Blocking Buffer, was added to the plate. The plate
was incubated 30 minutes at room temperature with shaking at 98 rpm.

Following the incubation, the plate was washed three times with 4°C 1X
Elf Wash. Then 100 µl of ELF 97 phosphatase substrate working solution (ELF 97
mRNA In Situ Hybridization Kit, Component D 1:10, Component E 1:500, Component
F 1:500, in Component C) was added to the plate and incubated 10 minutes in the dark.
Following the incubation, the assay plate fluorescence (excitation 360 nm, emission 535
nm) was read.

WO 01/75163    PCT/US01/10173

After 45 minutes, the assay plate was washed once with 4°C 1X Elf Wash, and the wash
was left in the plate. Then the assay plate fluorescence was read again as above.
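The substrate working solution above is specified by dilution ratios, so the component volumes scale with the number of wells. The sketch below works them out for a full plate. It assumes that a ratio written "1:N" means one part of the component in N parts of final volume, which is a common convention but is not stated in this specification; the kit's own instructions take precedence. The function name and the overage factor are hypothetical.

```python
def substrate_working_solution(n_wells, ul_per_well=100.0, overage=1.0):
    """Volumes (ul) of each kit component for the substrate working solution.

    ASSUMPTION: "1:N" means one part component per N parts final volume.
    """
    total = n_wells * ul_per_well * overage
    volumes = {
        "Component D": total / 10.0,   # 1:10
        "Component E": total / 500.0,  # 1:500
        "Component F": total / 500.0,  # 1:500
    }
    # Component C is the diluent and makes up the remaining volume.
    volumes["Component C"] = total - sum(volumes.values())
    return volumes


plate = substrate_working_solution(96)
for name, vol in plate.items():
    print(f"{name}: {vol:.1f} ul")
```

Passing an `overage` above 1.0 (for example 1.1) leaves some excess for pipetting losses.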

Results:

Sample Data

The four graphs in Figure 6 represent data generated from the haplotyping of 4
individuals. The signal generated from a negative control well (no PCR product added)
was subtracted from the signal generated for each of the four wells analyzed for each
individual. The background-subtracted signal was plotted for each well. From this
analysis, the determined haplotypes for each individual are as follows: #1-homozygote
G-C, #2-heterozygote G-T, G-C, #3-heterozygote G-C, C-C, #4-heterozygote G-T,
C-C. Sequence analysis of several subcloned products for each individual has
confirmed these haplotypes.

The foregoing written specification is considered to be sufficient to enable one
skilled in the art to practice the invention. The present invention is not limited in scope
by the examples provided, since the examples are intended as illustrations of various
aspects of the invention and other functionally equivalent embodiments are within the
scope of the invention. Various modifications of the invention in addition to those
shown and described herein will become apparent to those skilled in the art from the
foregoing description and fall within the scope of the inventive claims. The advantages
and objectives of the invention are not necessarily encompassed by each embodiment of
the invention.

I claim: 
CLAIMS

1. A method for haplotyping, comprising:

analyzing a first polymorphic locus of a nucleic acid within a sample by
specifically capturing the nucleic acid on a surface wherein the step of capturing the
nucleic acid on the surface identifies a first allele of a first SNP of the polymorphic
locus,

analyzing a second allele of the first SNP of the polymorphic locus by
specifically capturing the nucleic acid on a surface wherein the step of capturing the
nucleic acid on the surface identifies the second allele of the first SNP of the
polymorphic locus,

separately analyzing a second SNP of a polymorphic locus of the nucleic acid
sample to identify both alleles of the second SNP, and

determining the haplotype based on the identification of each allele of each SNP.

2. The method of claim 1, wherein the second SNP is analyzed using a method
selected from the group consisting of hybridization, primer extension, MALDI TOF, and
HPLC.

3. The method of claim 1, wherein the nucleic acid is captured by hybridization
with an ASO, and wherein the ASO is fixed to a surface.

4. The method of claim 3, wherein a first ASO complementary to a first allele of
the first SNP and a second ASO complementary to a second allele of the first SNP are
hybridized to the surface and are used to capture the nucleic acid.

5. The method of claim 1, wherein the surface is a multiwell dish.

6. The method of claim 1, wherein the surface is a chip.

7. The method of claim 1, wherein the surface is a slide.

8. The method of claim 1, wherein the surface is a bead.
9. The method of claim 4, wherein each ASO corresponding to an allele of the
first SNP further includes a spacer sequence.

10. The method of claim 9, wherein the spacer sequence is selected from the
group consisting of a poly-T, poly-A, poly-C, and poly-G.

11. The method of claim 2, wherein the second SNP is analyzed by hybridization
of the nucleic acid sample with an ASO complementary to a first allele of the second
SNP and an ASO complementary to a second allele of the second SNP.

12. The method of claim 11, wherein each of the ASOs corresponding to an
allele of the second SNP is hybridized independently to the nucleic acid sample.

13. The method of claim 11, wherein at least one of the ASOs complementary to
an allele of the first SNP and at least one of the ASOs complementary to an allele of the
second SNP contains a fluorescent label or quencher, the fluorescent label or quencher of
the two ASOs, being distinct from one another.

14. The method of claim 2, wherein the alleles of the second SNP are analyzed
simultaneously with one another.

15. The method of claim 1, wherein each of the ASOs complementary to an
allele of the first SNP and each of the ASOs complementary to an allele of the second
SNP contains a fluorescent label or quencher, the fluorescent label or quencher of each
of the four ASOs, being distinct from one another.

16. The method of claim 1, wherein the nucleic acid sample is prepared by PCR
amplification of a polymorphic locus from a genomic DNA sample.

17. The method of claim 1, wherein the nucleic acid sample is a reduced
complexity genome.



18. The method of claim 1, wherein the nucleic acid sample is labeled with a
first label.

19. The method of claim 1, wherein the presence of one set of alleles at the
polymorphic locus is associated with a disease and the haplotyping method is performed
to identify predisposition to the disease.

20. The method of claim 1, further comprising analyzing a third SNP of a
polymorphic locus of the nucleic acid sample to identify both alleles of the third SNP,
and determining the haplotype based on the identification of each allele of each SNP.

21. The method of claim 1, further comprising analyzing a fourth SNP of a
polymorphic locus of the nucleic acid sample to identify both alleles of the fourth SNP,
and determining the haplotype based on the identification of each allele of each SNP.

22. The method of claim 1, wherein the analysis of the first and second SNPs are
performed simultaneously.

23. The method of claim 1, wherein the nucleic acid is captured by a method
selected from the group consisting of OLA, primer extension, and binding partner-ASO
hybridization.

24. The method of claim 1, wherein the capture steps for the analysis of the first
and second alleles of the first SNP are performed using different capture methods.

25. The method of claim 1, wherein the nucleic acid sample is an RNA genome.

26. The method of claim 1, wherein the RNA genome is made from cDNA.

27. The method of claim 1, wherein the nucleic acid sample is genomic DNA.
28. The method of claim 1, wherein the nucleic acid sample is a mitochondrial
genome.

29. A method for haplotyping, comprising:

analyzing a genotype of a first SNP of a polymorphic locus of a nucleic acid
within a sample in solution by detecting the presence or absence of a first labeled probe
which specifically identifies a first putative allele of the SNP and detecting the presence
or absence of a second labeled probe which specifically identifies a second putative
allele of the SNP,

separating the nucleic acid sample based on the genotype of the first SNP,

analyzing a second SNP of the polymorphic locus of the separated nucleic acid
samples to identify the haplotype of the nucleic acid.

30. The method of claim 29, wherein the analysis of the first SNP is performed
using fluorescence detection.

31. The method of claim 30, wherein the nucleic acid sample is separated using a
flow cytometry.

32. The method of claim 29, wherein the second SNP is analyzed using a
method selected from the group consisting of hybridization, primer extension, MALDI
TOF, and HPLC.

33. The method of claim 29, wherein the nucleic acid sample is prepared by PCR
amplification of a polymorphic locus from a genomic DNA sample.

34. The method of claim 29, wherein the nucleic acid sample is a reduced
complexity genome.

35. The method of claim 29, wherein the second SNP is identified using a
capture reaction and wherein the nucleic acid is captured by a method selected from the
group consisting of OLA, primer extension, and binding partner-ASO hybridization.
36. The method of claim 29, wherein the nucleic acid sample is an RNA
genome.

37. The method of claim 29, wherein the RNA genome is made from cDNA.

38. The method of claim 29, wherein the nucleic acid sample is genomic DNA.

39. The method of claim 29, wherein the nucleic acid sample is a mitochondrial
genome.

40. A method for haplotyping, comprising:

labeling first and second SNPs of a polymorphic locus of a nucleic acid within a
sample in solution with a first, second, third, and fourth labeled probe which specifically
identifies a first and second putative allele of the first SNP and a first and second
putative allele of the second SNP respectively,

separating the labeled nucleic acid sample into single nucleic acid molecules,

detecting the presence or absence of the first, second, third, and fourth labeled
probes on the single nucleic acid molecules to identify the haplotype of the nucleic acid.

41. The method of claim 40, wherein the probes are labeled with fluorescence
molecules.

42. The method of claim 41, wherein each of the fluorescent molecules of the
labeled probes is spectrally distinct.

43. The method of claim 40, wherein the nucleic acid sample is prepared by
PCR amplification of a polymorphic locus from a genomic DNA sample.

44. The method of claim 40, wherein the nucleic acid sample is a reduced
complexity genome.
45. The method of claim 40, wherein the nucleic acid sample is an RNA
genome.

46. The method of claim 40, wherein the RNA genome is made from cDNA.

47. The method of claim 40, wherein the nucleic acid sample is genomic DNA.

48. The method of claim 40, wherein the nucleic acid sample is a mitochondrial
genome.

49. A method for haplotyping, comprising:

performing four hybridization reactions on a nucleic acid sample, each of the four
hybridization reactions involving one labeled probe specific for one allele of one of two
SNPs, each of the labeled probes labeled with a spectrally distinct label and wherein each
label on the probe specific for a first of the two SNPs is a spectral pair with the label on
each probe specific for the second of the two SNPs,

bringing each of the labeled probes in each hybridization reaction within energy
transfer distance from one another,

exciting one of the labeled probes in each hybridization reaction, and

detecting electromagnetic radiation released from the other labeled probe as a
signal, wherein the presence or absence of a signal for each hybridization reaction is an
indicator of the haplotype of the nucleic acid sample.

50. The method of claim 49, wherein each hybridization reaction is performed in
a separate vessel.

51. The method of claim 49, wherein the labeled probes are brought within
energy transfer proximity of one another using binding partners.

52. The method of claim 51, wherein the binding partners are avidin and biotin.

53. The method of claim 49, wherein the labeled probes are labeled ASOs.
54. The method of claim 49, wherein the method is performed in solution.

55. The method of claim 49, wherein the method is performed on a surface.

56. The method of claim 49, wherein the nucleic acid sample is an RNA
genome.

57. The method of claim 49, wherein the RNA genome is made from cDNA.

58. The method of claim 49, wherein the nucleic acid sample is genomic DNA.

59. The method of claim 49, wherein the nucleic acid sample is a mitochondrial
genome.

60. A kit comprising:

one or more containers housing:

a first set of ASOs, wherein the first set of ASOs represents two ASOs, each
containing one of the two alleles of a first SNP in a polymorphic locus,

a second set of ASOs, wherein the second set of ASOs represents two ASOs,
each containing one of the two alleles of a second SNP in the polymorphic locus, and

instructions for performing a hybridization reaction to determine a haplotype
from a genomic DNA sample using the first and second sets of ASOs.

61. The kit of claim 60, further comprising a set of PCR primers for amplifying
the polymorphic locus of the genomic DNA sample.

62. The kit of claim 60, wherein the first set of ASOs are fixed to a surface.

63. The kit of claim 60, wherein the spacer sequence is selected from the group
consisting of a poly-T, poly-A, poly-C, and poly-G.

64. The kit of claim 60, wherein the second set of ASOs are labeled.
STEP 1: SYNTHESIS OF ANTI-SENSE SNP1 ALLELE-SPECIFIC OLIGONUCLEOTIDES

STEP 2: ATTACHMENT OF ANTI-SENSE SNP1 ALLELE-SPECIFIC OLIGOS TO WELLS

STEP 3: AMPLIFICATION OF THE SNP1/SNP2 POLYMORPHIC LOCUS FROM 48 INDIVIDUALS
USING A Cy3-LABELED PCR REACTION

STEP 4: HYBRIDIZATION OF Cy3-LABELED PCR PRODUCTS FROM EACH INDIVIDUAL TO DUPLICATE
SNP1-(G) ALLELE WELLS AND DUPLICATE SNP1-(T) ALLELE WELLS

STEP 5: SYNTHESIS OF SNP2 ALLELE-SPECIFIC OLIGONUCLEOTIDES, Cy5-LABELED

STEP 6: PHASE II HYBRIDIZATION
HYBRIDIZATION OF Cy5-LABELED ALLELE-SPECIFIC OLIGONUCLEOTIDES TO PHASE I
HYBRIDIZATION PLATES

STEP 7: FLUORESCENT SCANNING AND HAPLOTYPE SCORING

FIG. 1

SUBSTITUTE SHEET (RULE 26)
SEQUENCE LISTING

<110> Polygenyx, Inc 

<120> High Throughput Methods for Haplotyping 

<130> P0715/7003 (HCL) 

<150> US 60/194,425 
<151> 2000-04-04 

<160> 24 

<170> Patentin version 3.0 

<210> 1 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial sequence
<222> (1)..(20) 

<223> Synthetic oligonucleotide 
<400> 1 

cctcagtgac atccttgcct 

<210> 2 

<211> 20 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 
<222> (1)..(20)

<223> Synthetic Oligonucleotide 



<400> 2 

catgcccatt cttctctggt 20 

<210> 3

<211> 31 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(31) 

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature 

<222> (1)..(1) 

<223> amino group attached 



<400> 3 

tttttttttt tttttagtct cccctttccc t 31 

<210> 4 

<211> 32 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(32)

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature

<222> (1)..(1)

<223> amino group attached 



<400> 4 

tttttttttt tttttagtct cccactttcc ct 32 

<210> 5 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(17)

<223> Synthetic Oligonucleotide 



<400> 5 

agggtggtgc cagaggt 17 

<210> 6 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<220>

<221> Artificial Sequence 

<222> (1)..(17) 

<223> Synthetic Oligonucleotide 

<400> 6 

agggtggtac cagaggt 17 

<210> 7 

<211> 20 

<212> DNA 

<213> Homo sapiens 
<220>

<221> Artificial Sequence

<222> (1)..(20)

<223> Synthetic Oligonucleotide

<220>

<221> misc_feature

<222> (1)..(1) 

<223> phosphate group attached 



<400> 7 

acttgacagc gagtgtgctg 

<210> 8 

<211> 18 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(18) 

<223> Synthetic Oligonucleotide 

<400> 8

gtccctttgc tgcgtgac 

<210> 9 

<211> 40 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(40) 

<223> Synthetic Oligonucleotide 



<220> 

<221> misc_feature 
<222> 

<223> amino group attached 

<400> 9 

tttttttttt tttttttttt tttcacccaa tggaagccat 

<210> 10 

<211> 40 

<212> DNA 

<213> Homo sapiens

<220> 

<221> Artificial Sequence 

<222> (1)..(40)

<223> Synthetic Oligonucleotide
<220>

<221> misc_feature

<222> (1)..(1)

<223> amino group attached 

<400> 10 

tttttttttt tttttttttt tttcacccaa tagaagccat 

<210> 11 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(17)

<223> Synthetic Oligonucleotide 
<400> 11

aggaaatcgg cagctgt 17

<210> 12 

<211> 17 

<212> DNA

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(17) 

<223> Synthetic Oligonucleotide 



<400> 12

aggaaatcag cagctgt 17

<210> 13 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(17)

<223> Synthetic Oligonucleotide 



<220> 

<221> misc_feature

<222> (1)..(3) 

<223> Biotin attached 

<400> 13 

aggaaatcgg cagctgt 17



<210> 14 
<211> 17 
<212> DNA 
<213> Homo sapiens
<220> 

<221> Artificial Sequence 

<222> (1)..(17) 

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature 

<222> (1)..(3) 

<223> Biotin attached 

<400> 14 

aggaaatcag cagctgt 17 

<210> 15 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(24) 

<223> Synthetic Oligonucleotide 



<400> 15 

gaacagcaat gcacattacc atgg 24



<210> 16

<211> 25

<212> DNA

<213> Homo sapiens

<220>

<221> Artificial Sequence

<222> (1)..(25)

<223> Synthetic Oligonucleotide

<400> 16

ctgtcaagta tttctccgca gcata 25

<210> 17 

<211> 40 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(40)

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature 

<222> (1)..(1)

<223> amino group attached 

<400> 17 

tttttttttt tttttttttt tttgccacaa tgaatgacat 40

<210> 18 

<211> 40 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(40) 

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature

<222> (1)..(1) 
<223> amino group attached

<400> 18 

tttttttttt tttttttttt tttgccacaa tcaatgacat 40

<210> 19 

<211> 17 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence 

<222> (1)..(17) 

<223> Synthetic Oligonucleotide 

<400> 19 

atgtcattga ttgtggc 17

<210> 20 

<211> 17

<212> DNA 

<213> Homo sapiens 

<220> 

<221> Artificial Sequence

<222> (1)..(17) 

<223> Synthetic Oligonucleotide 

<400> 20 

atgtcattca ttgtggc 17

<210> 21 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 
<221> Artificial Sequence

<222> (1)..(17)

<223> Synthetic Oligonucleotide 



<220> 

<221> misc_feature 

<222> (1)..(3) 

<223> biotin attached 



<400> 21 

tgtataatca gaattat 17

<210> 22 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(17)

<223> Synthetic Oligonucleotide 
<220> 

<221> misc_feature

<222> (1)..(3) 

<223> Biotin attached 



<400> 22 

tgtataatta gaattat 17 

<210> 23 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 






<221> Artificial Sequence 
<222> (1)..(17) 

<223> Synthetic Oligonucleotide 



<400> 23 

tgtataatca gaattat 17 

<210> 24 

<211> 17 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> Artificial Sequence 

<222> (1)..(17)

<223> Synthetic Oligonucleotide 



<400> 24 

tgtataatta gaattat 17



